ALGORITHMS AND COMPUTATION 
IN MATHEMATICS 


Polynomials 


Springer


11 


Algorithms and Computation 
in Mathematics · Volume 11


Editors 


Manuel Bronstein Arjeh M. Cohen 
Henri Cohen David Eisenbud
Bernd Sturmfels 


Victor V. Prasolov 


Polynomials 


Translated from the Russian by Dimitry Leites 


Springer


Victor V. Prasolov 


Independent University of Moscow 
Department Mathematics 

Bolshoy Vlasievskij per.11 

119002 Moscow, Russia 

e-mail: prasolov@mccme.ru 


Dimitry Leites (Translator) 


Stockholm University 
Department of Mathematics 
106 91 Stockholm, Sweden 
e-mail: mleites@math.su.se 


Originally published by MCCME 
Moscow Center for Continuous Math. Education 
in 2001 (Second Edition) 


Mathematics Subject Classification (2000): 12-XX, 12E05 


Library of Congress Control Number: 2009935697 


ISSN 1431-1550 

ISBN 978-3-540-40714-0 (hardcover) e-ISBN 978-3-642-03980-5 
ISBN 978-3-642-03979-9 (softcover) 

DOI 10.1007/978-3-642-03980-5 

This work is subject to copyright. All rights are reserved, whether the whole or part of the material 
is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, 
broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication 
of this publication or parts thereof is permitted only under the provisions of the German Copyright 


Law of September 9, 1965, in its current version, and permission for use must always be obtained 
from Springer. Violations are liable for prosecution under the German Copyright Law. 


Springer is a part of Springer Science+Business Media 
springeronline.com 
© Springer-Verlag Berlin Heidelberg 2004, First softcover printing 2010 


Printed in Germany 


The use of general descriptive names, registered names, trademarks, etc. in this publication does not 
imply, even in the absence of a specific statement, that such names are exempt from the relevant pro- 
tective laws and regulations and therefore free for general use. 


Typeset by the translator. 
Edited and reformatted by LE-TeX, Leipzig, using a Springer LaTeX macro package.
Cover design: deblik, Berlin 


Printed on acid-free paper 


Preface 


The theory of polynomials constitutes an essential part of university courses
of algebra and calculus. Nevertheless, there are very few books entirely
devoted to this theory.¹ Though, after the first Russian edition of this book was
printed, there appeared several books² devoted to particular aspects of the
polynomial theory, they have almost no intersection with this book.


¹ The following classical references (not translated into Russian and therefore not
mentioned in the Russian editions of this book) are rare exceptions: 
Barbeau E. J., Polynomials. Corrected reprint of the 1989 original. Problem 
Books in Mathematics. Springer-Verlag, New York, 1995. xxii+455 pp.; 
Borwein P., Erdélyi T., Polynomials and polynomial inequalities. Graduate 
Texts in Mathematics, 161. Springer-Verlag, New York, 1995. x+480 pp.; 
Obreschkoff N., Verteilung und Berechnung der Nullstellen reeller Polynome. 
(German) VEB Deutscher Verlag der Wissenschaften, Berlin 1963. viii+298 pp. 


² For example, some recent ones: Macdonald I. G., Affine Hecke algebras and
orthogonal polynomials. Cambridge Tracts in Mathematics, 157. Cambridge University
Press, Cambridge, 2003. x+175 pp.;

Phillips G. M., Interpolation and approximation by polynomials. CMS Books in 
Mathematics/Ouvrages de Mathématiques de la SMC, 14. Springer-Verlag, New 
York, 2003. xiv+312 pp.; 

Mason J. C., Handscomb D. C. Chebyshev polynomials. Chapman & Hall/CRC, 
Boca Raton, FL, 2003. xiv+341 pp.; 

Rahman Q. I., Schmeisser G., Analytic theory of polynomials, London Math. 
Soc. Monographs (N.S.) 26, 2002; 

Sheil-Small T., Complex polynomials, Cambridge studies in adv. math. 75, 
2002; 

Lomont J. S., Brillhart J., Elliptic polynomials. Chapman & Hall/CRC, Boca 
Raton, FL, 2001. xxiv+289 pp.; 

Krall A. M., Hilbert space, boundary value problems and orthogonal polynomials.
Operator Theory: Advances and Applications, 133. Birkhäuser Verlag, Basel,
2002. xiv+352 pp.;

Dunkl Ch. F., Xu Yuan, Orthogonal polynomials of several variables.
Encyclopedia of Mathematics and its Applications, 81. Cambridge University Press,
Cambridge, 2001. xvi+390 pp. (Hereafter the translator’s footnotes.)




This book contains an exposition of the main results in the theory of 
polynomials, both classical and modern. Considerable attention is given to 
Hilbert’s 17th problem on the representation of non-negative polynomials by 
the sums of squares of rational functions and its generalizations. Galois theory 
is discussed primarily from the point of view of the theory of polynomials, not 
from that of the general theory of fields and their extensions. More precisely: 

In Chapter 1 we discuss, mostly classical, theorems about the distribution 
of the roots of a polynomial and of its derivative. It is also shown how to 
determine the number of real roots to a real polynomial, and how to separate 
them. 

Chapter 2 deals with irreducibility criteria for polynomials with integer
coefficients, and with algorithms for factorization of such polynomials and for 
polynomials with coefficients in the integers mod p. 

In Chapter 3 we introduce and study some special classes of polynomials: 
symmetric (polynomials which are invariant when the indeterminates are per- 
muted), integer valued (polynomials which attain integer values at all integer 
points), cyclotomic (polynomials with all primitive nth roots of unity as roots), 
and some interesting classes introduced by Chebyshev, and by Bernoulli. 

In Chapter 4 we collect a lot of scattered results on properties of polyno- 
mials. We discuss, e.g., how to construct polynomials with prescribed values 
in certain points (interpolation), how to represent a polynomial as a sum 
of powers of polynomials of degree one, and give a construction of numbers 
which are not roots of any polynomial with rational coefficients (transcenden- 
tal numbers). 

Chapter 5 is devoted to the classical Galois theory. It is well known that 
the roots of a polynomial equation of degree at most four in one variable can 
be expressed in terms of radicals of arithmetic expressions of its coefficients. 
A main application of Galois theory is that this is not possible in general for 
equations of degree five or higher. 

In Chapter 6 three classical theorems of Hilbert are given: an ideal in a
polynomial ring has a finite basis (Hilbert’s basis theorem); if a polynomial $f$
vanishes on all common zeros of $f_1, \dots, f_r$, then some power of $f$ is a linear
combination (with polynomial coefficients) of $f_1, \dots, f_r$ (Hilbert’s
Nullstellensatz); and if $M = \bigoplus M_i$ is a finitely generated module over a
polynomial ring over $K$, then $\dim_K M_i$ is a polynomial in $i$ for large $i$
(the Hilbert polynomial of $M$).

Furthermore, the theory of Gröbner bases is introduced. Gröbner bases
are a tool for calculations in polynomial rings. An application is that solving
systems of polynomial equations in several variables with finitely many
solutions can be reduced to solving polynomial equations in one variable.

In the final Chapter 7 considerable attention is given to Hilbert’s 17th 
problem on the representation of non-negative polynomials as the sum of 
squares of rational functions, and to its generalizations. The Lenstra-Lenstra-Lovász
algorithm for factorization of polynomials with integer coefficients is
discussed in an appendix. 




Two important results of the theory of polynomials whose exposition re- 
quires quite a lot of space did not enter the book: how to solve fifth degree 
equations by means of theta functions, and the classification of commuting 
polynomials. These results are expounded in detail in two recently published 
books in which I directly participated: [Pr3] and [Pr4]. 

During the work on this book I received financial support from the Russian 
Fund of Basic Research under Project No. 01-01-00660. 


Acknowledgement. Together with the translator, I am thankful to Dr. 
Eastham for meticulous and friendly editing of the English and mathematics, 
to J. Borcea, R. Fröberg, B. Shapiro and V. Kostov for useful comments.


V. Prasolov 


Moscow, May 1999 
Notational conventions


As usual, $\mathbb{Z}$ denotes the set of all integers, $\mathbb{N}$ the subset of positive
integers, and $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$ for $p$ prime.

$(\mathbb{Z}/n\mathbb{Z})^*$ denotes the set of invertible elements of $\mathbb{Z}/n\mathbb{Z}$.

$|S|$ denotes the cardinality of the set $S$.

$R[x]$ denotes the ring of polynomials in one indeterminate $x$ with coefficients
in a commutative ring $R$.

$[x]$ denotes the integer part of a given real number $x$, i.e., the greatest
integer which is $\le x$.


Numbering of Theorems, Lemmas and Examples is usually continuous 
throughout each section, e.g., reference to Lemma 2.3.2 means that the Lemma 
is to be found in subsection 2.3 inside the same chapter 2. 

Subsections are numbered separately, so Theorem 2.3.4 may occur in
subsec. 2.3.2.

Certain Lemmas and Examples (considered of local importance) are num- 
bered simply Lemma 1, and so on, and, to find it, the page is indicated in the 
reference. 
Roots of Polynomials


1.1 Inequalities for roots 


1.1.1 The Fundamental Theorem of Algebra 


In olden times, when algebraic theorems were scanty, the following statement 
received the title of the Fundamental Theorem of Algebra: 


“A given polynomial of degree n with complex coefficients has exactly n 
roots (multiplicities counted).” 


The first to formulate this statement was Albert Girard in 1629, but he
did not even try to prove it. The first to realize the necessity of proving the
Fundamental Theorem of Algebra was d’Alembert. His proof (1746) was not,
however, considered convincing. Euler (1749), Foncenex (1759) and Lagrange
(1771) offered their proofs but these proofs were not without blemishes, either.

The first to give a satisfactory proof of the Fundamental Theorem of
Algebra was Gauss. He gave three different versions of the proof (1799, 1815
and 1816) and in 1849 he additionally published a refined version of his first
proof.

For a review of the different proofs of the Fundamental Theorem of Alge- 
bra, see [Ti]. We confine ourselves to one proof. This proof is based on the 
following Rouché’s theorem, which is of interest by itself. 


Theorem 1.1.1 (Rouché). Let $f$ and $g$ be polynomials, and $\gamma$ a closed curve
without self-intersections in the complex plane.¹ If

\[ |f(z) - g(z)| < |f(z)| + |g(z)| \tag{1} \]

for all $z \in \gamma$, then inside $\gamma$ there is an equal number of roots of $f$ and $g$
(multiplicities counted).


¹ The plane $\mathbb{C}^1$ of the complex variable.


Proof. In the complex plane, consider vector fields $v(z) = f(z)$ and
$w(z) = g(z)$. From (1) it follows that at no point of $\gamma$ are the vectors $v$
and $w$ directed opposite to each other. Recall that the index of the curve $\gamma$
with respect to a vector field $v$ is the number of revolutions of the vector $v(z)$
as it completely circumscribes the curve $\gamma$. (For a more detailed acquaintance
with the properties of index we recommend Chapter 6 of [Pr2].) Consider the
vector field

\[ v_t = t v + (1 - t) w. \]

Then $v_0 = w$ and $v_1 = v$. It is also clear that at every point $z \in \gamma$ the vector
$v_t(z)$ is nonzero. This means that the index $\operatorname{ind}(t)$ of $\gamma$ with respect to the
vector field $v_t$ is well defined. The integer $\operatorname{ind}(t)$ depends continuously on $t$,
and hence $\operatorname{ind}(t) = \text{const}$. In particular, the indices of $\gamma$ with respect to the
vector fields $v$ and $w$ coincide.

Let the index of the singular point $z_0$ be defined as the index of the curve
$|z - z_0| = \varepsilon$, where $\varepsilon$ is sufficiently small. It is not difficult to show that the
index of $\gamma$ with respect to a vector field $v$ is equal to the sum of indices of
singular points, i.e., those at which $v(z) = 0$. For the vector field $v(z) = f(z)$,
the index of the singular point $z_0$ is equal to the multiplicity of the root $z_0$ of
$f$. Therefore the coincidence of the indices of $\gamma$ with respect to vector fields
$v(z) = f(z)$ and $w(z) = g(z)$ implies that, inside $\gamma$, the number of roots of $f$
is equal to that of $g$.


With the help of Rouché’s theorem it is not only possible to prove the 
Fundamental Theorem of Algebra but also to estimate the absolute value of 
any root of the polynomial in question. 


Theorem 1.1.2. Let $f(z) = z^n + a_1 z^{n-1} + \dots + a_n$, where $a_i \in \mathbb{C}$. Then,
inside the circle $|z| = 1 + \max_i |a_i|$, there are exactly $n$ roots of $f$
(multiplicities counted).


Proof. Let $a = \max_i |a_i|$. Inside the circle considered, the polynomial
$g(z) = z^n$ has root 0 of multiplicity $n$. Therefore it suffices to verify that,
if $|z| = 1 + a$, then $|f(z) - g(z)| < |f(z)| + |g(z)|$. We will prove even that
$|f(z) - g(z)| < |g(z)|$, i.e.,

\[ |a_1 z^{n-1} + \dots + a_n| < |z|^n. \]

Clearly, if $|z| = 1 + a$, then

\[ |a_1 z^{n-1} + \dots + a_n| \le a \left( |z|^{n-1} + \dots + 1 \right) = a \, \frac{|z|^n - 1}{|z| - 1} = |z|^n - 1 < |z|^n. \]
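As a rough numerical illustration of Theorem 1.1.2, the following Python sketch builds a monic polynomial from a list of sample roots and checks that every root lies inside the circle $|z| = 1 + \max_i |a_i|$. The helper names `poly_from_roots` and `root_bound` and the sample roots are assumptions made only for this illustration.

```python
def poly_from_roots(roots):
    """Coefficients [1, a_1, ..., a_n] (descending powers) of prod(z - r)."""
    coeffs = [1 + 0j]
    for r in roots:
        # Multiply the current polynomial by (z - r).
        times_z = coeffs + [0j]
        times_minus_r = [0j] + [-r * c for c in coeffs]
        coeffs = [s + t for s, t in zip(times_z, times_minus_r)]
    return coeffs


def root_bound(coeffs):
    """Radius 1 + max|a_i| from Theorem 1.1.2 for a monic polynomial."""
    return 1 + max(abs(c) for c in coeffs[1:])


# Hypothetical sample roots, chosen only for the illustration.
roots = [2 + 1j, -3, 0.5j, 1 - 4j]
coeffs = poly_from_roots(roots)
bound = root_bound(coeffs)
all_inside = all(abs(r) < bound for r in roots)
```

The bound is usually far from sharp: it is driven by the largest coefficient, which here is of the order of the product of the root moduli.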


1.1.2 Cauchy’s theorem 


Here we discuss Cauchy’s theorem on the roots of polynomials as well as its 
corollaries and generalizations. 
Theorem 1.1.3 (Cauchy). Let $f(x) = x^n - b_1 x^{n-1} - \dots - b_n$, where all
the numbers $b_i$ are non-negative and at least one of them is nonzero. The
polynomial $f$ has a unique (simple) positive root $p$ and the absolute values of
the other roots do not exceed $p$.


Proof. Set

\[ F(x) = -\frac{f(x)}{x^n} = \frac{b_1}{x} + \dots + \frac{b_n}{x^n} - 1. \]

If $x \ne 0$, the equation $f(x) = 0$ is equivalent to the equation $F(x) = 0$. As
$x$ grows from 0 to $+\infty$ the function $F(x)$ strictly decreases from $+\infty$ to $-1$.
Therefore, for $x > 0$, the function $F$ vanishes at precisely one point, $p$. We
have

\[ -\frac{f'(p)}{p^n} = F'(p) = -\frac{b_1}{p^2} - \dots - \frac{n b_n}{p^{n+1}} < 0. \]

Hence $p$ is a simple root of $f$.

It remains to prove that if $x_0$ is a root of $f$, then $q = |x_0| \le p$. Suppose
that $q > p$. Then, since $F$ is monotonic, it follows that $F(q) < 0$, i.e., $f(q) > 0$.
On the other hand, the equality $x_0^n = b_1 x_0^{n-1} + \dots + b_n$ implies that

\[ q^n \le b_1 q^{n-1} + \dots + b_n, \]

i.e., $f(q) \le 0$, which is a contradiction.
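The monotonicity of $F$ used in the proof also gives a simple way to locate the positive root $p$ numerically. The sketch below is one possible approach (plain bisection on $F$), not a method taken from the text; the function name `cauchy_positive_root` and the tolerance are assumptions. It is applied to $x^2 - x - 1$, whose positive root is the golden ratio.

```python
def cauchy_positive_root(b, tol=1e-12):
    """Positive root of x^n - b[0] x^(n-1) - ... - b[n-1].

    Assumes all b[i] >= 0 and at least one b[i] > 0, as in Cauchy's theorem.
    """
    def F(x):
        # F(x) = b_1/x + ... + b_n/x^n - 1 is strictly decreasing for x > 0.
        return sum(bk / x ** (k + 1) for k, bk in enumerate(b)) - 1

    # Bracket the root: F(lo) > 0 >= F(hi).
    hi = 1.0
    while F(hi) > 0:
        hi *= 2
    lo = hi
    while F(lo) <= 0:
        lo /= 2

    # Bisection on the bracket.
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2
        if F(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# x^2 - x - 1: the roots are (1 + sqrt(5))/2 and (1 - sqrt(5))/2.
p = cauchy_positive_root([1.0, 1.0])
other_root = (1 - 5 ** 0.5) / 2
```

In this example the second root has absolute value about 0.618, which is smaller than $p$, in agreement with the theorem.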


Remark. Cauchy’s theorem is directly related to the Perron-Frobenius 
theorem on non-negative matrices (cf. [Wil]). 


The polynomial $x^{2n} - x^n - 1$ has $n$ roots whose absolute values are equal
to the value of the positive root of this polynomial. Therefore, in Cauchy’s
theorem, the estimate

the absolute values of the roots are $\le p$

cannot, in general, be replaced by the estimate

the absolute values of the roots are $< p$.


Ostrovsky showed, nevertheless, that in a sufficiently general situation such 
a replacement is possible. 


Theorem 1.1.4 (Ostrovsky). Let $f(x) = x^n - b_1 x^{n-1} - \dots - b_n$, where all
the numbers $b_i$ are non-negative and at least one of them is nonzero.

If the greatest common divisor of the indices of the positive coefficients $b_k$
is equal to 1, then $f$ has a unique positive root $p$ and the absolute values of
the other roots are $< p$.
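Proof. Let only the coefficients $b_{k_1}, b_{k_2}, \dots, b_{k_m}$, where $k_1 < k_2 < \dots < k_m$,
be positive. Since the greatest common divisor of $k_1, \dots, k_m$ is equal to
1, there exist integers $s_1, \dots, s_m$ such that $s_1 k_1 + \dots + s_m k_m = 1$. Consider
again the function

\[ F(x) = \frac{b_{k_1}}{x^{k_1}} + \dots + \frac{b_{k_m}}{x^{k_m}} - 1. \]

The equation $F(x) = 0$ has a unique positive solution $p$. Let $x$ be any other
(nonzero) root of $f$. Set $q = |x|$. Then

\[ 1 = \left| \frac{b_{k_1}}{x^{k_1}} + \dots + \frac{b_{k_m}}{x^{k_m}} \right| \le \left| \frac{b_{k_1}}{x^{k_1}} \right| + \dots + \left| \frac{b_{k_m}}{x^{k_m}} \right| = \frac{b_{k_1}}{q^{k_1}} + \dots + \frac{b_{k_m}}{q^{k_m}}, \]

i.e., $F(q) \ge 0$. We see that the equality $F(q) = 0$ is only possible if

\[ \frac{b_{k_i}}{x^{k_i}} = \left| \frac{b_{k_i}}{x^{k_i}} \right| > 0 \quad \text{for all } i. \]

But in this case

\[ \frac{b_{k_1}^{s_1} \cdots b_{k_m}^{s_m}}{x} = \left( \frac{b_{k_1}}{x^{k_1}} \right)^{s_1} \cdots \left( \frac{b_{k_m}}{x^{k_m}} \right)^{s_m} > 0, \]

i.e., $x > 0$. This contradicts the fact that $x \ne p$ and $p$ is the only positive root
of the equation $F(x) = 0$. Thus $F(q) > 0$. Therefore, since $F(x)$ is monotonic
for positive $x$, it follows that $q < p$.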


The Cauchy-Ostrovsky theorem implies the following estimate of the ab- 
solute value of the roots of polynomials with positive coefficients. 


Theorem 1.1.5. a) (Eneström-Kakeya) If all the coefficients of the polynomial
$g(x) = a_0 x^{n-1} + \dots + a_{n-1}$ are positive, then, for any root $\xi$ of this
polynomial, we have

\[ \min_{1 \le i \le n-1} \left\{ \frac{a_i}{a_{i-1}} \right\} = \delta \le |\xi| \le \gamma = \max_{1 \le i \le n-1} \left\{ \frac{a_i}{a_{i-1}} \right\}. \]

b) (Ostrovsky) Let $\frac{a_k}{a_{k-1}} < \gamma$ for $k = k_1, \dots, k_m$. If the greatest common
divisor of the numbers $n, k_1, \dots, k_m$ is equal to 1, then $|\xi| < \gamma$.


Proof. Consider the polynomial

\[ (x - \gamma) g(x) = a_0 x^n - (\gamma a_0 - a_1) x^{n-1} - \dots - (\gamma a_{n-2} - a_{n-1}) x - \gamma a_{n-1}. \]

By definition, $\gamma \ge \frac{a_i}{a_{i-1}}$, i.e., $\gamma a_{i-1} - a_i \ge 0$. Therefore, by Cauchy’s theorem,
$\gamma$ is the only positive root of the polynomial $(x - \gamma) g(x)$ and the absolute
values of the other roots of this polynomial are $\le \gamma$.
If $\xi$ is a root of $g$, then $\eta = \frac{1}{\xi}$ is a root of $a_{n-1} y^{n-1} + \dots + a_0$. Hence

\[ \frac{1}{|\xi|} = |\eta| \le \max_{1 \le i \le n-1} \left\{ \frac{a_{i-1}}{a_i} \right\} = \frac{1}{\min\limits_{1 \le i \le n-1} \left\{ \frac{a_i}{a_{i-1}} \right\}}, \]

i.e.,

\[ |\xi| \ge \delta = \min_{1 \le i \le n-1} \left\{ \frac{a_i}{a_{i-1}} \right\}. \]

If condition b) is satisfied, the root $\gamma$ of the polynomial $(x - \gamma) g(x)$ is
strictly greater than the absolute values of the other roots of this polynomial.
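A small numerical check of part a) may be helpful. The sketch below takes the quadratic $3x^2 + 2x + 1$ (an example chosen here, not taken from the text), computes the ratios $a_i / a_{i-1}$ and compares them with the moduli of the two roots obtained from the quadratic formula. The function name `enestrom_kakeya_bounds` is an assumption.

```python
import cmath


def enestrom_kakeya_bounds(a):
    """Return (delta, gamma): the min and max of a[i] / a[i-1].

    `a` lists the positive coefficients a_0, ..., a_{n-1} of
    g(x) = a_0 x^(n-1) + ... + a_{n-1}.
    """
    ratios = [a[i] / a[i - 1] for i in range(1, len(a))]
    return min(ratios), max(ratios)


# g(x) = 3x^2 + 2x + 1, a hypothetical example with positive coefficients.
a = [3.0, 2.0, 1.0]
delta, gamma = enestrom_kakeya_bounds(a)

# Roots of the quadratic by the usual formula.
disc = cmath.sqrt(a[1] ** 2 - 4 * a[0] * a[2])
roots = [(-a[1] + disc) / (2 * a[0]), (-a[1] - disc) / (2 * a[0])]
moduli = [abs(r) for r in roots]
```

Here the ratios are 2/3 and 1/2, and both roots have modulus $1/\sqrt{3}$, which lies between them.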


Remark. The Eneström-Kakeya theorem is also related to the Perron-Frobenius
theorem, cf. [An2].


An essential generalization of the Eneström-Kakeya theorem is obtained in
[Gal]. However, the formulation of this generalization is rather cumbersome, 
and therefore we do not give it here. 


1.1.3 Laguerre’s theorem 


Let $z_1, \dots, z_n \in \mathbb{C}$ be points of unit mass. The point $\zeta = \frac{1}{n}(z_1 + \dots + z_n)$ is
called the center of mass of $z_1, \dots, z_n$.
This notion can be generalized as follows. Perform a fractional-linear
transformation $w$ that sends $z_0$ to $\infty$, i.e.,

\[ w(z) = \frac{a}{z - z_0} + b. \]

Let us find the center of mass of the images of $z_1, \dots, z_n$ and then apply
the inverse transformation $w^{-1}$. Simple calculations show that the result does
not depend on $a$ and $b$, namely, we obtain the point

\[ \zeta_{z_0} = z_0 + \frac{n}{\dfrac{1}{z_1 - z_0} + \dots + \dfrac{1}{z_n - z_0}} \tag{1} \]

which is called the center of mass of $z_1, \dots, z_n$ with respect to $z_0$.
Clearly,
the center of mass of $z_1, \dots, z_n$ lies inside their convex hull.

This statement easily generalizes to the case of the center of mass with
respect to $z_0$. One only has to replace the lines that connect the points $z_i$ and
$z_j$ by circles passing through $z_i$, $z_j$ and $z_0$. The point $z_0$ corresponding to $\infty$
lies outside the convex hull.
Theorem 1.1.6. Let $f(z) = (z - z_1) \cdot \ldots \cdot (z - z_n)$. Then the center of mass
of the roots of $f$ with respect to an arbitrary point $z$ is given by the formula

\[ \zeta_z = z - n \frac{f(z)}{f'(z)}. \]

Proof. Clearly

\[ \frac{f'(z)}{f(z)} = \frac{1}{z - z_1} + \dots + \frac{1}{z - z_n}. \]

The desired statement follows directly from formula (1).
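The two expressions for the center of mass with respect to a point can be compared numerically. The following Python sketch evaluates formula (1) directly and also computes $z - n f(z)/f'(z)$ for $f(z) = \prod (z - z_i)$. The sample points and the helper names `center_of_mass` and `poly_and_derivative` are assumptions chosen for illustration.

```python
def center_of_mass(points, z0):
    """Formula (1): z0 + n / sum(1 / (z_i - z0))."""
    n = len(points)
    return z0 + n / sum(1 / (zi - z0) for zi in points)


def poly_and_derivative(roots, z):
    """Values f(z) and f'(z) for f(z) = prod(z - r), via the product rule."""
    f = 1 + 0j
    df = 0j
    for r in roots:
        # (f * (z - r))' = f' * (z - r) + f
        df = df * (z - r) + f
        f = f * (z - r)
    return f, df


# Hypothetical sample data.
roots = [1 + 0j, 2j, -1 - 1j]
z = 3 + 1j

f_val, df_val = poly_and_derivative(roots, z)
via_formula_1 = center_of_mass(roots, z)
via_theorem = z - len(roots) * f_val / df_val
```

Both values agree up to rounding error, as Theorem 1.1.6 asserts.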


Theorem 1.1.7 (Laguerre). Let $f(z)$ be a polynomial of degree $n$ and $x$ its
simple root. Then the center of mass of all the other roots of $f(z)$ with respect
to $x$ is the point

\[ X = x - 2(n-1) \frac{f'(x)}{f''(x)}. \]

Proof. Let $f(z) = (z - x) F(z)$. Then $f'(z) = F(z) + (z - x) F'(z)$ and
$f''(z) = 2F'(z) + (z - x) F''(z)$. Therefore $f'(x) = F(x)$ and $f''(x) = 2F'(x)$.
Applying the preceding theorem to the polynomial $F$ of degree $n - 1$, and
point $z = x$, we obtain the desired statement.


Theorem 1.1.8 (Laguerre). Let $f(z)$ be a polynomial of degree $n$ and

\[ X(z) = z - 2(n-1) \frac{f'(z)}{f''(z)}. \]

Let the circle (or line) $C$ pass through a simple root $z_1$ of $f$ and the other
roots of $f$ belong to one of the two domains into which $C$ divides the plane.
Then $X(z_1)$ also belongs to the same domain.


Proof. In the case of the “usual” center of mass, the circle $C$ corresponds
to the line such that all the roots of $f(z)$, except $z_1$, lie on one side of it. The
center of mass of these roots lies on the same side of this line.
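The point $X(z_1) = z_1 - 2(n-1) f'(z_1)/f''(z_1)$ is easy to compute from the coefficients. The sketch below does this for the cubic $(z-1)(z-2)(z-3)$ at the root $z_1 = 3$ and compares the result with the center of mass of the remaining roots 1 and 2 with respect to 3. The polynomial and the helper names `derivative`, `evaluate` and `laguerre_point` are assumptions made for this illustration.

```python
def derivative(coeffs):
    """Coefficients (descending powers) of the derivative."""
    n = len(coeffs) - 1
    return [c * (n - k) for k, c in enumerate(coeffs[:-1])]


def evaluate(coeffs, z):
    """Horner evaluation of a polynomial given in descending powers."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value


def laguerre_point(coeffs, z):
    """X(z) = z - 2(n-1) f'(z) / f''(z) for a polynomial f of degree n."""
    n = len(coeffs) - 1
    d1 = derivative(coeffs)
    d2 = derivative(d1)
    return z - 2 * (n - 1) * evaluate(d1, z) / evaluate(d2, z)


# f(z) = (z - 1)(z - 2)(z - 3) = z^3 - 6z^2 + 11z - 6
coeffs = [1, -6, 11, -6]
z1 = 3
X = laguerre_point(coeffs, z1)

# Center of mass of the other roots 1 and 2 with respect to z1 = 3.
center = z1 + 2 / (1 / (1 - z1) + 1 / (2 - z1))
```

Both computations give 5/3, a point that lies between the remaining roots 1 and 2.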


Corollary. Let $z_1$ be one of the simple roots of $f$ with the maximal
absolute value. Then $|X(z_1)| \le |z_1|$, i.e.,

\[ \left| z_1 - 2(n-1) \frac{f'(z_1)}{f''(z_1)} \right| \le |z_1|. \]

Proof. All the roots of $f$ lie in the disk $\{z \in \mathbb{C} \mid |z| \le |z_1|\}$, and therefore
$X(z_1)$ also belongs to this disk.


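Theorem 1.1.9. Let $f$ be a polynomial of degree $n$ with real coefficients and define

\[ \zeta_z = z - n \frac{f(z)}{f'(z)}. \]

All the roots of $f$ are real if and only if $\operatorname{Im} z \cdot \operatorname{Im} \zeta_z < 0$ for any $z \in \mathbb{C} \setminus \mathbb{R}$.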
Proof. Suppose first that all the roots of $f$ are real. Let $\operatorname{Im} z = a > 0$.
The line consisting of the points with the imaginary part $\varepsilon$, where $0 < \varepsilon < a$,
separates the point $z$ from all the roots of $f$ since they belong to the real axis.
Therefore $\operatorname{Im} \zeta_z \le \varepsilon$. In the limit as $\varepsilon \to 0$, we obtain $\operatorname{Im} \zeta_z \le 0$.

It is easy to verify that it is impossible to have $\operatorname{Im} \zeta_z = 0$. Indeed, let
$\zeta_z \in \mathbb{R}$. Consider a circle passing through $z$ and tangent to the real axis at $\zeta_z$.
Slightly jiggling this circle we can construct a circle on one side of which lie
the points $z$ and $\zeta_z$, and on the other side lie all the roots of $f$. If $\operatorname{Im} z = a < 0$,
the arguments are similar.

Now suppose that $\operatorname{Im} z \cdot \operatorname{Im} \zeta_z < 0$ for all $z \in \mathbb{C} \setminus \mathbb{R}$. Let $z_1$ be a root of $f$
such that $\operatorname{Im} z_1 \ne 0$. Then $\lim_{z \to z_1} \zeta_z = z_1$, and therefore $\operatorname{Im} z \cdot \operatorname{Im} \zeta_z > 0$
for all $z$ sufficiently close to $z_1$, which contradicts the assumption.


Our presentation of Laguerre’s theory is based on the paper [Gr], see
also [Po1].


1.1.4 Apolar polynomials 


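Let $f(z)$ be a polynomial of degree $n$ and $\zeta$ a fixed number or $\infty$. The function

\[ A_\zeta f(z) = \begin{cases} (\zeta - z) f'(z) + n f(z) & \text{if } \zeta \ne \infty, \\ f'(z) & \text{if } \zeta = \infty \end{cases} \]

is called the derivative of $f(z)$ with respect to point $\zeta$. It is easy to verify that,
if

\[ f(z) = \sum_{k=0}^{n} \binom{n}{k} a_k z^k, \tag{1} \]

then

\[ \frac{1}{n} f'(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} a_{k+1} z^k. \]

Therefore

\[ \frac{1}{n} A_\zeta f(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} (a_k + a_{k+1} \zeta) z^k. \tag{2} \]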

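Let $z_1, \dots, z_n$ be the roots of the polynomial (1), and let $\zeta_1, \dots, \zeta_n$ be the
roots of the polynomial

\[ g(z) = \sum_{k=0}^{n} \binom{n}{k} b_k z^k. \tag{3} \]

Formula (2) implies that

\[ \frac{1}{n!} A_{\zeta_1} A_{\zeta_2} \cdots A_{\zeta_n} f(z) = a_0 + a_1 \sigma_1 + a_2 \sigma_2 + \dots + a_n \sigma_n, \]

where

\[ \sigma_1 = \zeta_1 + \dots + \zeta_n = -\binom{n}{1} \frac{b_{n-1}}{b_n}, \qquad \sigma_2 = \zeta_1 \zeta_2 + \dots + \zeta_{n-1} \zeta_n = \binom{n}{2} \frac{b_{n-2}}{b_n}, \qquad \dots \]

Hence the equality $A_{\zeta_1} A_{\zeta_2} \cdots A_{\zeta_n} f(z) = 0$ is equivalent to the equality

\[ a_0 b_n - \binom{n}{1} a_1 b_{n-1} + \binom{n}{2} a_2 b_{n-2} + \dots + (-1)^n a_n b_0 = 0. \tag{4} \]

The polynomials $f$ and $g$ given by (1) and (3) and whose coefficients are
related via (4) are said to be apolar.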
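Condition (4) is straightforward to test by machine once the ordinary coefficients are rescaled by binomial coefficients. The sketch below is a minimal illustration; the function names and the sample pair $f(z) = 1 - z + c z^2$, $g(z) = z^2 - 2z$ are assumptions chosen here, with $c$ an arbitrary complex number.

```python
from math import comb


def binomial_normalized(coeffs):
    """Convert ordinary coefficients c_0, ..., c_n (ascending powers)
    into the a_k of formula (1), where c_k = C(n, k) * a_k."""
    n = len(coeffs) - 1
    return [c / comb(n, k) for k, c in enumerate(coeffs)]


def apolarity_form(f_coeffs, g_coeffs):
    """Left-hand side of (4) for two polynomials of the same degree n."""
    n = len(f_coeffs) - 1
    a = binomial_normalized(f_coeffs)
    b = binomial_normalized(g_coeffs)
    return sum((-1) ** k * comb(n, k) * a[k] * b[n - k] for k in range(n + 1))


c = 0.7 - 0.2j        # arbitrary sample value
f = [1, -1, c]        # 1 - z + c z^2
g = [0, -2, 1]        # z^2 - 2z, with roots 0 and 2
value = apolarity_form(f, g)

# A pair that is not apolar, for contrast: f and z^2 + 1.
other = apolarity_form(f, [1, 0, 1])
```

For the first pair the form vanishes for every $c$, because the only term involving $c$ is multiplied by $b_0 = 0$; for the second pair it equals $1 + c$.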

A circular domain is either the inner or the exterior part of a disk or the 
half plane. 


Theorem 1.1.10 (J. H. Grace, 1902). Let $f$ and $g$ be apolar polynomials.
If all the roots of $f$ belong to a circular domain $K$, then at least one of the
roots of $g$ also belongs to $K$.


Proof. We will need the following auxiliary statement. 


Lemma 1.1.11. Let all the roots $z_1, \dots, z_n$ of $f(z)$ lie inside the circular
domain $K$ and let $\zeta$ lie outside $K$. Then all the roots of $A_\zeta f(z)$ lie inside $K$.


Proof. Observe first that, if $w_i$ is a root of the polynomial $A_\zeta f(z)$, then $\zeta$
is the center of mass of the roots of $f(z)$ with respect to $w_i$. Indeed, if $\zeta \ne \infty$,
then we can express the equality $A_\zeta f(w_i) = 0$ in the form

\[ (\zeta - w_i) f'(w_i) + n f(w_i) = 0, \quad \text{i.e.,} \quad \zeta = w_i - n \frac{f(w_i)}{f'(w_i)}. \]

If $\zeta = \infty$, then $f'(w_i) = A_\zeta f(w_i) = 0$, and hence

\[ \sum_{j=1}^{n} \frac{1}{z_j - w_i} = -\frac{f'(w_i)}{f(w_i)} = 0. \]

Therefore the center of mass of the points $z_1, \dots, z_n$ with respect to $w_i$ is
situated at

\[ w_i + \frac{n}{\sum_{j=1}^{n} \dfrac{1}{z_j - w_i}} = \infty. \]

Now it is clear that point $w_i$ cannot lie outside $K$. Indeed, if $w_i$ were
situated outside $K$, then the center of mass of $z_1, \dots, z_n$ with respect to $w_i$
would be inside $K$. However, this contradicts the fact that $\zeta$ lies outside $K$.
With the help of Lemma 1.1.11, Theorem 1.1.10 is proved as follows.
Suppose that all the roots $\zeta_1, \dots, \zeta_n$ of $g$ lie outside $K$. Consider the polynomial
$A_{\zeta_2} \cdots A_{\zeta_n} f(z)$. Its degree is equal to 1, i.e., it is of the form $c(z - k)$. Lemma
1.1.11 implies that $k \in K$. Since $f$ and $g$ are apolar polynomials, it follows
that $A_{\zeta_1}(z - k) = 0$. On the other hand, the direct calculation of the
derivative shows that $A_{\zeta_1}(z - k) = \zeta_1 - k$. Therefore $k = \zeta_1 \notin K$ and we have a
contradiction.


Every polynomial f has a whole family of polynomials apolar to it. Having 
selected a convenient apolar polynomial we can, thanks to Grace’s theorem, 
prove that f possesses a root in a given circular domain. Sometimes for the 
same goal it is convenient to use Lemma 1.1.11 directly. 


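Example 1. The polynomial

\[ f(z) = 1 - z + c z^n, \quad \text{where } c \in \mathbb{C}, \]

possesses a root in the disk $|z - 1| \le 1$.

Proof. The polynomials

\[ f(z) = 1 + \binom{n}{1} \left( -\frac{1}{n} \right) z + c z^n \quad \text{and} \quad g(z) = z^n + \binom{n}{1} b_{n-1} z^{n-1} + \dots + b_0 \]

are apolar if

\[ 1 - n \left( -\frac{1}{n} \right) b_{n-1} + (-1)^n c \, b_0 = 0, \quad \text{i.e.,} \quad 1 + b_{n-1} + (-1)^n c \, b_0 = 0. \]

Now let $\zeta_k = 1 - \exp(2\pi i k / n)$ for $k = 1, \dots, n$, and take $g(z)$ to be

\[ g(z) = \prod_{k=1}^{n} (z - \zeta_k) = z^n + \binom{n}{1} b_{n-1} z^{n-1} + \dots + b_0. \]

Then

\[ b_{n-1} = -1 \quad \text{and} \quad b_0 = \pm \prod_{k=1}^{n} \zeta_k = 0. \]

Therefore the polynomials $f(z)$ and $g(z)$ are apolar. Since all the roots of $g$
lie in the disk $|z - 1| \le 1$, at least one of the roots of $f$ lies in this disk.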


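Example 2. The polynomial $1 - z + c_1 z^{n_1} + \dots + c_k z^{n_k}$, where
$1 < n_1 < n_2 < \dots < n_k$, has at least one root in the disk

\[ |z| \le \frac{n_1}{n_1 - 1} \cdot \frac{n_2}{n_2 - 1} \cdots \frac{n_k}{n_k - 1}. \]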
Proof. Let us start with the polynomial $f(z) = 1 - z + c z^{n_1}$. Suppose
on the contrary that all its roots lie in the domain $|z| > \frac{n_1}{n_1 - 1}$. Then by
Lemma 1.1.11 the roots of the polynomial

\[ A_0 f(z) = n_1 - (n_1 - 1) z \]

also lie in the domain $|z| > \frac{n_1}{n_1 - 1}$. But the root of $A_0 f(z)$ is equal to
$\frac{n_1}{n_1 - 1}$ and we have a contradiction.

For the polynomial $f(z) = 1 - z + c_1 z^{n_1} + \dots + c_k z^{n_k}$, we use induction
on $k$. Consider the polynomial

\[ A_0 f(z) = n_k - (n_k - 1) z + c_1 (n_k - n_1) z^{n_1} + \dots + c_{k-1} (n_k - n_{k-1}) z^{n_{k-1}}. \]

In this polynomial, replace $z$ by $\frac{n_k}{n_k - 1} z$. By the induction hypothesis, at
least one of the roots of the polynomial obtained lies in the disk

\[ |z| \le \frac{n_1}{n_1 - 1} \cdots \frac{n_{k-1}}{n_{k-1} - 1}, \]

and hence at least one of the roots of $A_0 f(z)$ lies in the disk

\[ |z| \le \frac{n_1}{n_1 - 1} \cdot \frac{n_2}{n_2 - 1} \cdots \frac{n_k}{n_k - 1}. \]

Therefore the hypothesis that all the roots of $f(z)$ lie outside the disk leads
to a contradiction.


Let $f(z) = \sum_{i=0}^{n} \binom{n}{i} a_i z^i$ and $g(z) = \sum_{i=0}^{n} \binom{n}{i} b_i z^i$. The polynomial

\[ h(z) = \sum_{i=0}^{n} \binom{n}{i} a_i b_i z^i \]

is called the composition of $f$ and $g$.


Theorem 1.1.12 (Szegő). Let $f$ and $g$ be polynomials of degree $n$, and let all
the roots of $f$ lie in a circular domain $K$. Then every root of the composition
$h$ of $f$ and $g$ is of the form $-\zeta_i k$, where $\zeta_i$ is a root of $g$ and $k \in K$.


Proof. Let $\gamma$ be a root of $h$, i.e., $\sum_{i=0}^{n} \binom{n}{i} a_i b_i \gamma^i = 0$. Then the polynomials
$f(z)$ and $G(z) = z^n g(-\gamma z^{-1})$ are apolar. Therefore, by Grace’s theorem, one
of the roots of $G(z)$ lies in $K$. Let, for example, $g(-\gamma k^{-1}) = 0$, where $k \in K$.
Then $-\gamma k^{-1} = \zeta_i$, where $\zeta_i$ is a root of $g$.


For polynomials whose degrees are not necessarily equal, there is the fol- 
lowing analogue of Grace’s theorem. 
Theorem 1.1.13 ([Az]). Let $f(z) = \sum_{i=0}^{n} \binom{n}{i} a_i z^i$ and $g(z) = \sum_{i=0}^{m} \binom{m}{i} b_i z^i$ be
polynomials with $m \le n$. Let the coefficients of $f$ and $g$ be related as follows:

\[ \binom{m}{0} a_0 b_m - \binom{m}{1} a_1 b_{m-1} + \dots + (-1)^m \binom{m}{m} a_m b_0 = 0. \tag{5} \]


Then the following statements hold: 

a) If all the roots of $g(z)$ belong to the disk $|z| \le r$, then at least one of the
roots of $f(z)$ also belongs to this disk;

b) If all the roots of $f(z)$ lie outside the disk $|z| \le r$, then at least one of
the roots of $g(z)$ also lies outside this disk.


Proof. [Ru] a) Relation (5) is invariant with respect to the change of $z$
to $rz$ in $f$ and $g$, and therefore we may assume that $r = 1$. Suppose on the
contrary that all the roots of $f(z)$ lie in the domain $|z| > 1$. Then all the
roots of the polynomial $z^n f\left(\frac{1}{z}\right)$ lie in the domain $|z| < 1$. Therefore, from the
Gauss-Lucas theorem (Theorem 1.2.1), it follows that all the roots
of the polynomial

\[ f_1(z) = D^{n-m} \left( z^n f\left(\tfrac{1}{z}\right) \right) = n(n-1) \cdots (m+1) \sum_{i=0}^{m} \binom{m}{i} a_i z^{m-i} \]

lie in the domain $|z| < 1$. Therefore all the roots of the polynomial

\[ f_2(z) = \frac{m!}{n!} \, z^m f_1\left(\tfrac{1}{z}\right) = \sum_{i=0}^{m} \binom{m}{i} a_i z^i \]

lie in the domain $|z| > 1$.
Relation (5) means that the polynomials $f_2$ and $g$ are apolar. Since all the
roots of $f_2$ lie in the circular domain $|z| > 1$, it follows from Grace’s theorem
that at least one of the roots of $g$ also lies in this domain, and we have a
contradiction.
b) All the roots of $f_2$ lie in the domain $|z| > 1$, hence, it follows from
Grace’s theorem that at least one of the roots of $g$ also lies in this domain.


1.1.5 The Routh-Hurwitz problem 


In various problems on stability one has to investigate whether all the roots 
of a given polynomial belong to the left half-plane (i.e., whether the real parts 
of the roots are negative). The polynomials with this property are said to be 
stable. The Routh-Hurwitz problem is 


how to find out directly by looking at the coefficients of the polynomial 
whether it is stable or not. 


Several different solutions of the problem are known (see, e.g., [Po2]). We
will confine ourselves to one simple criterion given in [St3].
First, we observe that it suffices to consider the case of polynomials with
real coefficients. Indeed, if $p(z) = \sum a_n z^n$ is a polynomial with complex
coefficients we can consider the polynomial

\[ p^*(z) = p(z) \bar{p}(z) = \left( \sum a_n z^n \right) \left( \sum \bar{a}_n z^n \right). \]

Clearly, the real parts of the roots of $\bar{p}(z)$ are the same as those of $p(z)$.
Moreover, the coefficients of $p^*(z)$ are symmetric with respect to $a_n$ and $\bar{a}_n$.
This means that the coefficients of $p^*$ are invariant under conjugation, that
is, they are real.


Theorem 1.1.14. Let $p(z) = z^n + a_1 z^{n-1} + \dots + a_n$ be a polynomial with
real coefficients; let $q(z) = z^m + b_1 z^{m-1} + \dots + b_m$, where $m = \frac{1}{2} n(n-1)$,
be the polynomial whose roots are all the sums of pairs of the roots of $p$. The
polynomial $p$ is stable if and only if all the coefficients of the polynomials $p$
and $q$ are positive.


Proof. Suppose that $p$ is stable. To a negative root $\alpha$ of $p$ there
corresponds the factor $z - \alpha$ with positive coefficients. To a pair of conjugate roots
with the negative real part there corresponds the factor

\[ (z - \alpha - i\beta)(z - \alpha + i\beta) = z^2 - 2\alpha z + \alpha^2 + \beta^2 \]

with positive coefficients. Thus all the coefficients of $p$ are positive.

The complex roots of $q$ fall into the pairs of conjugate roots because the
coefficients of $q$ are real. Further, the real parts of all the roots of $q$ are negative.
The same arguments as for $p$ show that all the coefficients of $q$ are positive.

Next, let all the coefficients of $p$ and $q$ be positive. In this case, all the real
roots of $p$ and $q$ are negative. Therefore, if $\alpha$ is a real root of $p$, then $\alpha < 0$, and,
if $\alpha \pm i\beta$ is a pair of complex conjugate roots of $p$, then $2\alpha = (\alpha + i\beta) + (\alpha - i\beta)$
is a root of $q$; hence $2\alpha < 0$.
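The criterion can be tried out on small examples. The sketch below is only an illustration under a strong simplifying assumption: it builds $q$ from the known roots of $p$, whereas in an actual application the coefficients of $q$ would have to be computed from those of $p$ without finding roots (that step is not shown). The function names and both sample polynomials are assumptions.

```python
from itertools import combinations


def monic_from_roots(roots):
    """Coefficients (descending powers) of prod(z - r)."""
    coeffs = [1 + 0j]
    for r in roots:
        # Multiply the current polynomial by (z - r).
        coeffs = [s + t for s, t in zip(coeffs + [0j],
                                        [0j] + [-r * c for c in coeffs])]
    return coeffs


def stable_by_theorem(roots, eps=1e-9):
    """Positivity test of Theorem 1.1.14 for a real polynomial p,
    given here (for illustration only) by its roots."""
    p = monic_from_roots(roots)
    pair_sums = [u + v for u, v in combinations(roots, 2)]
    q = monic_from_roots(pair_sums)
    # For a real polynomial the imaginary parts are rounding noise.
    return all(c.real > eps for c in p + q)


# p(z) = (z + 1)(z + 2)(z + 3): all roots in the left half-plane.
stable_case = stable_by_theorem([-1, -2, -3])

# p(z) = z^3 + 2z^2 + 1.25z + 12.75 has roots -3 and 0.5 +- 2i:
# all coefficients of p are positive, yet p is not stable.
unstable_case = stable_by_theorem([-3, 0.5 + 2j, 0.5 - 2j])
```

In the second example $q(z) = z^3 + 4z^2 + 5.25z - 10.25$ has a negative coefficient, which is how the criterion detects the pair of roots with positive real part.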


1.2 The roots of a given polynomial and of its derivative 


1.2.1 The Gauss-Lucas theorem 


In 1836, Gauss showed that all the roots of P’, distinct from the multiple 
roots of the polynomial P itself, serve as the points of equilibrium for the field 
of forces created by identical particles placed at the roots of P (provided that 
r particles are located at the root of multiplicity r) if each particle creates an 
attractive force inversely proportional to the distance to this particle. From 
this theorem of Gauss it is easy to deduce Theorem 1.2.1 given below. Gauss 
himself did not mention this. The first to formulate and prove Theorem 1.2.1 
was a French engineer F. Lucas in 1874. Therefore Theorem 1.2.1 is often 
referred to as the Gauss-Lucas theorem. 
Theorem 1.2.1 (Gauss-Lucas). The roots of P’ belong to the convex hull
of the roots of the polynomial P itself.


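Proof. Let $P(z) = (z - z_1) \cdot \ldots \cdot (z - z_n)$. It is easy to verify that

\[ \frac{P'(z)}{P(z)} = \frac{1}{z - z_1} + \dots + \frac{1}{z - z_n}. \tag{1} \]

Suppose that $P'(w) = 0$, $P(w) \ne 0$ and suppose on the contrary that $w$ does
not belong to the convex hull of the points $z_1, \dots, z_n$. Then one can draw a
line through $w$ that does not intersect the convex hull of $z_1, \dots, z_n$. Therefore
the vectors $w - z_1, \dots, w - z_n$ lie in one half-plane determined by this line.
Hence the vectors $\frac{1}{w - z_1}, \dots, \frac{1}{w - z_n}$ also lie in one half-plane, since
$\frac{1}{z} = \frac{\bar{z}}{|z|^2}$. Therefore

\[ \frac{P'(w)}{P(w)} = \frac{1}{w - z_1} + \dots + \frac{1}{w - z_n} \ne 0. \]

This is a contradiction, and hence $w$ belongs to the convex hull of the roots
of $P$.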
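For a cubic the statement can be checked directly, since the convex hull of the roots is a triangle and the derivative is a quadratic. The following Python sketch does this for the sample roots 0, 1 and $i$; the roots and the helper names `critical_points_of_cubic` and `in_triangle` are assumptions made for the illustration.

```python
import cmath


def critical_points_of_cubic(z1, z2, z3):
    """Roots of P'(z) for P(z) = (z - z1)(z - z2)(z - z3)."""
    s = z1 + z2 + z3
    q = z1 * z2 + z2 * z3 + z3 * z1
    # P'(z) = 3 z^2 - 2 s z + q
    disc = cmath.sqrt(4 * s * s - 12 * q)
    return (2 * s + disc) / 6, (2 * s - disc) / 6


def in_triangle(w, a, b, c, eps=1e-12):
    """True if w lies in the closed triangle with vertices a, b, c."""
    def cross(p, q, r):
        # z-component of the cross product (q - p) x (r - p)
        return ((q - p).conjugate() * (r - p)).imag

    d1, d2, d3 = cross(a, b, w), cross(b, c, w), cross(c, a, w)
    has_neg = min(d1, d2, d3) < -eps
    has_pos = max(d1, d2, d3) > eps
    return not (has_neg and has_pos)


# Hypothetical sample roots forming a right triangle.
a, b, c = 0j, 1 + 0j, 1j
w1, w2 = critical_points_of_cubic(a, b, c)
inside = in_triangle(w1, a, b, c) and in_triangle(w2, a, b, c)
```

Here the two roots of $P'$ are approximately $0.569 + 0.098i$ and $0.098 + 0.569i$, both inside the triangle.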


Relation (1) allows one to prove the following properties of the roots of P’ 
for any polynomial P with real roots. 


Theorem 1.2.2 ([An1]). Let

\[ P(z) = (z - x_1) \cdot \ldots \cdot (z - x_n), \quad \text{where } x_1 < \dots < x_n. \]

If some root $x_i$ is replaced by $x'_i \in (x_i, x_{i+1})$, then all the roots of $P'$ increase
their value.


Proof. Let $z_1 < z_2 < \cdots < z_{n-1}$ be the roots of P’, and let $x_1,\dots,x_n$
be the roots of P. Let Q be the polynomial obtained from P by the replacement,
let $z_1' < z_2' < \cdots < z_{n-1}'$ be the roots of Q’ and let
$x_1' = x_1,\ \dots,\ x_{i-1}' = x_{i-1},\ x_i',\ x_{i+1}' = x_{i+1},\ \dots,\ x_n' = x_n$
be the roots of Q. For the roots $z_k$ and $z_k'$, the relation (1) takes the form

$$\sum_{j=1}^{n}\frac{1}{z_k - x_j} = 0, \qquad \sum_{j=1}^{n}\frac{1}{z_k' - x_j'} = 0. \tag{2}$$

Suppose that the statement of the theorem is false, i.e., $z_k' \le z_k$ for some k.
Then $z_k' - x_i' < z_k - x_i$. Observe that the differences $z_k' - x_i'$ and
$z_k - x_i$ are of the same sign. Indeed,

$$z_j < x_i,\ z_j' < x_i' \ \text{ for } j \le i - 1 \quad\text{and}\quad z_j > x_i,\ z_j' > x_i' \ \text{ for } j \ge i.$$

Hence $\frac{1}{z_k - x_i} < \frac{1}{z_k' - x_i'}$, and in the same way
$\frac{1}{z_k - x_j} \le \frac{1}{z_k' - x_j'}$ for all $j = 1,\dots,n$. But in this case
relations (2) cannot hold simultaneously.


1.2.2 The roots of the derivative and the focal points of an ellipse


The roots of the derivative of a cubic polynomial have the following interesting 
geometric interpretation. 


Theorem 1.2.3 (van den Berg, [Be2]). Let the roots of a cubic polynomial
P form the vertices of a triangle ABC in the complex plane. Then the roots
of P’ are at the focal points of the ellipse tangent to the sides of the
triangle ABC at their midpoints.


First proof. Observe first of all that if $Q(z) = P(z - z_0)$, then
$Q'(z) = P'(z - z_0)$. Therefore we can take any point for the origin.

We can represent any affine transformation of the plane as a composition
of an isometry, a homothety, and a transformation of the form
$(x, y) \mapsto (x, y\cos\alpha)$ in a Cartesian coordinate system. Therefore we may
assume that the triangle ABC is obtained from the equilateral triangle with
vertices $w$, $\varepsilon w$ and $\varepsilon^2 w$, where $|w| = 1$ and
$\varepsilon = \exp(2\pi i/3)$, under the transformation

$$z \mapsto \frac{z + \bar z}{2} + \frac{z - \bar z}{2}\cos\alpha = z\cos^2\frac{\alpha}{2} + \bar z\sin^2\frac{\alpha}{2}. \tag{1}$$

Then the semi-axes a and b of the ellipse considered are equal to $\frac12$ and
$\frac12\cos\alpha$; the distance from its center to each of the focal points $F_1$
and $F_2$ is equal to $\sqrt{a^2 - b^2} = \frac12\sin\alpha$. Under the dilation with
coefficient

$$\frac{2}{\sin\alpha} = \left(\sin\frac{\alpha}{2}\cos\frac{\alpha}{2}\right)^{-1}$$

the points $F_1$ and $F_2$ transform into $(\pm 1, 0)$. The composition of
transformation (1) and this dilation amounts to the transformation

$$z \mapsto z\cot\frac{\alpha}{2} + \bar z\tan\frac{\alpha}{2}.$$

Set $a = w\cot\frac{\alpha}{2}$. Then the polynomial with roots A, B, and C is of the
form

$$P(x) = \left(x - a - \frac{1}{a}\right)\left(x - a\varepsilon - \frac{1}{a\varepsilon}\right)\left(x - a\varepsilon^2 - \frac{1}{a\varepsilon^2}\right).$$

It is easy to verify that $P'(x) = 3x^2 + 3\varepsilon + 3\bar\varepsilon = 3x^2 - 3$, and
therefore the roots of P’ are $\pm 1$.


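Second proof. [Sc5] Let $\varepsilon = \exp(2\pi i/3)$ and let $z_0, z_1, z_2$ be the
roots of the polynomial P considered. Select numbers $\zeta_0, \zeta_1, \zeta_2$ so that

$$z_0 = \zeta_0 + \zeta_1 + \zeta_2, \quad z_1 = \zeta_0 + \zeta_1\varepsilon + \zeta_2\varepsilon^2, \quad z_2 = \zeta_0 + \zeta_1\varepsilon^2 + \zeta_2\varepsilon, \tag{2}$$

i.e.,

$$3\zeta_0 = z_0 + z_1 + z_2, \quad 3\zeta_1 = z_0 + z_1\varepsilon^2 + z_2\varepsilon, \quad 3\zeta_2 = z_0 + z_1\varepsilon + z_2\varepsilon^2.$$

In what follows we assume that $z_0 + z_1 + z_2 = 0$, i.e., $\zeta_0 = 0$.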

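It is easy to verify that the curve $\zeta_1 e^{i\varphi} + \zeta_2 e^{-i\varphi}$, where
$0 \le \varphi \le 2\pi$, is an ellipse whose semi-axes are directed along the bisectors
of the exterior and interior angles of the angle $\angle\zeta_1 O\zeta_2$, where O is
the origin, and the lengths of the semi-axes are equal to $|\zeta_1| + |\zeta_2|$ and
$\bigl||\zeta_1| - |\zeta_2|\bigr|$. Indeed, the curve considered is the image of the unit
circle under the map $z \mapsto \zeta_1 z + \zeta_2\bar z$. Further, if
$\zeta_1 = |\zeta_1|e^{i\alpha}$ and $\zeta_2 = |\zeta_2|e^{i\beta}$, then

$$\zeta_1 e^{i\varphi} + \zeta_2 e^{-i\varphi} = |\zeta_1|e^{i(\varphi + \alpha)} + |\zeta_2|e^{i(\beta - \varphi)}.$$

The absolute value of this expression attains its maximum at
$\varphi = \frac{\beta - \alpha}{2} + k\pi$ and its minimum at
$\varphi = \frac{\beta - \alpha}{2} + \frac{\pi}{2} + k\pi$. These values of $\varphi$
correspond precisely to the directions of the bisectors indicated.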

The focal points $f_1$ and $f_2$ of the ellipse $\zeta_1 e^{i\varphi} + \zeta_2 e^{-i\varphi}$
lie on the line corresponding to the angle $\varphi = \frac{\beta - \alpha}{2}$, i.e.,
$f_1^2/(\zeta_1\zeta_2) = f_2^2/(\zeta_1\zeta_2)$ is a positive number. Further,
the square of the distance of the focal point to the center of the ellipse is equal
to the difference of the squares of the semi-axes, i.e., it is equal to

$$(|\zeta_1| + |\zeta_2|)^2 - (|\zeta_1| - |\zeta_2|)^2 = 4|\zeta_1\zeta_2|.$$

Hence $f_1^2 = f_2^2 = 4\zeta_1\zeta_2$.

Relations (2) for $\zeta_0 = 0$ show that the vertices $z_0, z_1, z_2$ of the triangle
considered lie on the ellipse $\zeta_1 e^{i\varphi} + \zeta_2 e^{-i\varphi}$ and the
mid-points of its sides lie on the ellipse
$\frac12\left(\zeta_1 e^{i\varphi} + \zeta_2 e^{-i\varphi}\right)$. The mid-point of a chord
of the first ellipse lies on the second ellipse only if this chord is tangent to the
second ellipse. Therefore we have to prove that the focal points of the ellipse
$\frac12\left(\zeta_1 e^{i\varphi} + \zeta_2 e^{-i\varphi}\right)$ coincide with the roots of
the derivative of the polynomial $P = (z - z_0)(z - z_1)(z - z_2)$. The focal points
of the ellipse satisfy the equation $z^2 - \zeta_1\zeta_2 = 0$, and the roots of P’
satisfy

$$3z^2 + z_0z_1 + z_0z_2 + z_1z_2 = 0, \quad\text{i.e.,}\quad 3(z^2 - \zeta_1\zeta_2) = 0.$$
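
The statement of Theorem 1.2.3 and the identity $z^2 = \zeta_1\zeta_2$ from the second proof can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the triangle is arbitrary, and the test of the focal property uses the fact that the sum of distances from the two foci is the same for every point of an ellipse, in particular for the three midpoints of the sides.

```python
import cmath

EPS3 = cmath.exp(2j * cmath.pi / 3)    # epsilon = exp(2*pi*i/3)

def derivative_roots(z0, z1, z2):
    """Roots of P' for P(z) = (z - z0)(z - z1)(z - z2)."""
    s1 = z0 + z1 + z2
    s2 = z0 * z1 + z0 * z2 + z1 * z2
    root = cmath.sqrt(4 * s1 * s1 - 12 * s2)
    return (2 * s1 + root) / 6, (2 * s1 - root) / 6

def zetas(z0, z1, z2):
    """zeta_1 and zeta_2 from relations (2); both are translation invariant."""
    zeta1 = (z0 + z1 * EPS3 ** 2 + z2 * EPS3) / 3
    zeta2 = (z0 + z1 * EPS3 + z2 * EPS3 ** 2) / 3
    return zeta1, zeta2

def focal_sums(z0, z1, z2):
    """Sum of distances from the roots of P' to each midpoint of a side."""
    f1, f2 = derivative_roots(z0, z1, z2)
    mids = [(z0 + z1) / 2, (z1 + z2) / 2, (z2 + z0) / 2]
    return [abs(m - f1) + abs(m - f2) for m in mids]

if __name__ == "__main__":
    print(focal_sums(0, 4, 1 + 3j))    # three (numerically) equal values
```

Equal sums for the three midpoints are consistent with the roots of P’ being the foci of an ellipse through those midpoints.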


1.2.3 Localization of the roots of the derivative


Jensen’s disks


Let f be a polynomial with real coefficients. For every pair of conjugate roots
$z$ and $\bar z$ of f, the disk with diameter¹ $z\bar z$ is called a Jensen’s disk of f.


Theorem 1.2.4 (Jensen). Any non-real root of f’ lies inside or on the 
boundary of one of the Jensen’s disks of f. 


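¹ We mean that $z$ and $\bar z$ are the endpoints of a diameter of this disk.

Proof. Let $z_1,\dots,z_n$ be the roots of f. Then

$$\frac{f'(z)}{f(z)} = \sum_{j=1}^{n}\frac{1}{z - z_j}. \tag{1}$$

Let us show first of all that if z lies outside Jensen’s disk with diameter
$z_pz_q$, then

$$\operatorname{sgn}\operatorname{Im}\left(\frac{1}{z - z_p} + \frac{1}{z - z_q}\right) = -\operatorname{sgn}\operatorname{Im} z. \tag{2}$$

Indeed, writing $z_p = a + bi$ and $z_q = a - bi$ with real a and b, we have

$$\frac{1}{z - a - bi} + \frac{1}{z - a + bi} = \frac{2(z - a)}{(z - a)^2 + b^2} = \frac{2(z - a)\left((\bar z - a)^2 + b^2\right)}{\left|(z - a)^2 + b^2\right|^2}$$

and

$$\operatorname{Im}\left((\bar z - a)|z - a|^2 + (z - a)b^2\right) = \left(b^2 - |z - a|^2\right)\operatorname{Im} z.$$

Let us show now that if $z \notin \mathbb{R}$ and $z_j = a \in \mathbb{R}$, then

$$\operatorname{sgn}\operatorname{Im}\left(\frac{1}{z - a}\right) = -\operatorname{sgn}\operatorname{Im} z. \tag{3}$$

Indeed,

$$\frac{1}{z - a} - \frac{1}{\bar z - a} = \frac{\bar z - z}{|z - a|^2} = \frac{-2i\operatorname{Im} z}{|z - a|^2}.$$

Formulas (1), (2), (3) imply that if a point $z \notin \mathbb{R}$ lies outside all the
Jensen’s disks, then

$$\operatorname{sgn}\operatorname{Im}\frac{f'(z)}{f(z)} = -\operatorname{sgn}\operatorname{Im} z \ne 0.$$

Hence $f'(z) \ne 0$, i.e., z is not a root of f’.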
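
A small worked instance may help fix the statement of Theorem 1.2.4. The polynomial below is chosen purely for illustration and does not come from the text.

```latex
% f has one pair of conjugate roots, +i and -i, so its only Jensen disk is |z| <= 1.
\[
  f(z) = (z^2 + 1)(z - 1) = z^3 - z^2 + z - 1,
  \qquad
  f'(z) = 3z^2 - 2z + 1 .
\]
% The roots of f' are non-real and lie strictly inside the Jensen disk:
\[
  z = \frac{1 \pm i\sqrt{2}}{3},
  \qquad
  |z|^2 = \frac{1 + 2}{9} = \frac{1}{3} < 1 .
\]
```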


As a refinement of Jensen’s theorem, we prove the following estimate for
the number of the roots of the derivative whose real parts belong to a given
segment.


Theorem 1.2.5 (Walsh). Let $I = [\alpha, \beta]$, and let K be the union of I and
Jensen’s disks intersecting I. If K contains k roots of a polynomial f(z), then
the number of the roots of f’(z) that lie in K is between $k - 1$ and $k + 1$.


Proof. Let C be the boundary of the smallest rectangle whose sides are
parallel to the coordinate axes and which contains K. Consider the restriction
to C of the map $z \mapsto e^{i\varphi}$, where $\varphi = \arg\frac{f'(z)}{f(z)}$.
Formulas (1), (2) and (3) imply that the image of the part of C that lies in the
upper half-plane lies on the half-circle $|z| = 1$, $\operatorname{Im} z < 0$, whereas
the image of the part of C that lies in the lower half-plane lies on the half-circle
$|z| = 1$, $\operatorname{Im} z > 0$. Therefore the number of revolutions of the image
of C around the origin is equal to either 0 or $\pm 1$. This means that the indices
of C with respect to the vector fields f(z) and f’(z) either coincide or differ by
$\pm 1$, i.e., the total numbers of the zeros of functions f and f’ lying inside C
either coincide or differ by $\pm 1$.


Walsh’s theorem


Theorem 1.2.6 (Walsh). Let the roots of the polynomials $f_1$ and $f_2$ lie in
the disks $K_1$ and $K_2$ with radii $r_1$ and $r_2$ and centers at points $c_1$ and
$c_2$, respectively. Then every root of the derivative of $f = f_1f_2$ lies either in
$K_1$, or in $K_2$, or in the disk K of radius $\frac{n_2r_1 + n_1r_2}{n_1 + n_2}$
centered at $\frac{n_2c_1 + n_1c_2}{n_1 + n_2}$, where $n_1 = \deg f_1$ and
$n_2 = \deg f_2$.
Proof. Let z be a root of f’ lying outside $K_1$ and $K_2$. Then

$$f_1'(z)f_2(z) + f_1(z)f_2'(z) = 0;$$

moreover, $f_1(z)$, $f_2(z)$, $f_1'(z)$, $f_2'(z)$ are nonzero.
Consider $\zeta_1$ and $\zeta_2$, the centers of mass of the roots of $f_1$ and $f_2$ with
respect to z, respectively. By Theorem 1.1.6

$$\zeta_1 = z - n_1\frac{f_1(z)}{f_1'(z)}, \qquad \zeta_2 = z - n_2\frac{f_2(z)}{f_2'(z)}.$$

Hence

$$n_2\zeta_1 + n_1\zeta_2 = (n_1 + n_2)z - n_1n_2\left(\frac{f_1(z)}{f_1'(z)} + \frac{f_2(z)}{f_2'(z)}\right) = (n_1 + n_2)z,$$

i.e., $z = \frac{n_2\zeta_1 + n_1\zeta_2}{n_1 + n_2}$. Since all the roots of $f_i$ lie in
$K_i$, it follows that $\zeta_i \in K_i$. It remains to observe that if points $\zeta_1$
and $\zeta_2$ of mass $n_2$ and $n_1$ lie in disks $K_1$ and $K_2$, respectively, then
their center of mass z lies in the disk K.


The Grace-Heawood theorem 


Theorem 1.2.7 (J. H. Grace, 1902; P. J. Heawood, 1907). If $z_1$ and $z_2$
are distinct roots of a polynomial f of degree n, then the disk $|z - c| \le r$, where
$c = \frac12(z_1 + z_2)$ and $r = \frac12|z_1 - z_2|\cot\frac{\pi}{n}$, contains at
least one root of f’.

Proof. Let¹ $f'(z) = \sum_{k=0}^{n-1}\binom{n-1}{k}a_kz^k$. Then

$$0 = f(z_2) - f(z_1) = \int_{z_1}^{z_2} f'(z)\,dz = \sum_{k=0}^{n-1}(-1)^k\binom{n-1}{k}a_kb_{n-1-k},$$

where the coefficients $b_0,\dots,b_{n-1}$ depend only on $z_1$ and $z_2$ and not on the
coefficients $a_0,\dots,a_{n-1}$. Therefore, given $z_1$ and $z_2$, we can construct a
polynomial $g(z) = \sum_{k=0}^{n-1}\binom{n-1}{k}b_kz^k$ apolar to f’(z).


¹ This expression of f’ differs by a constant factor from formula (*) in sec. 1.1.4.

To obtain an explicit formula for g, set $a_k = (-1)^ka^{n-1-k}$, i.e., consider
$h(z) = (a - z)^{n-1}$ in place of f’(z). In this case

$$g(a) = \sum_{k=0}^{n-1}\binom{n-1}{k}b_ka^k = \int_{z_1}^{z_2}(a - z)^{n-1}\,dz = \frac{(a - z_1)^n - (a - z_2)^n}{n}.$$

The roots of g are of the form

$$\zeta_k = \frac{z_1 + z_2}{2} + i\,\frac{z_1 - z_2}{2}\cot\frac{k\pi}{n} \quad\text{for } k = 1, 2, \dots, n - 1,$$

and all of them lie in the disk considered (those with $k = 1$ and $k = n - 1$ lie
on its boundary). Therefore, by Theorem 1.1.10 (see p. 8), the disk $|z - c| \le r$
contains at least one root of f’.
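
The cotangent formula for the roots $\zeta_k$ is easy to sanity-check numerically. The sketch below assumes only the explicit expression $g(z) = \frac{1}{n}\left((z - z_1)^n - (z - z_2)^n\right)$ obtained above; the function names and sample values are illustrative.

```python
import math

def apolar_roots(z1, z2, n):
    """The points zeta_k = c + i*d*cot(k*pi/n), k = 1..n-1, with c, d as below."""
    c = (z1 + z2) / 2          # center of the Grace-Heawood disk
    d = (z1 - z2) / 2
    return [c + 1j * d / math.tan(k * math.pi / n) for k in range(1, n)]

def heawood_radius(z1, z2, n):
    """r = |z1 - z2| / 2 * cot(pi / n)."""
    return abs(z1 - z2) / 2 / math.tan(math.pi / n)

if __name__ == "__main__":
    z1, z2, n = 1 + 2j, -3 + 0.5j, 5
    c, r = (z1 + z2) / 2, heawood_radius(z1, z2, n)
    for zeta in apolar_roots(z1, z2, n):
        residual = abs((zeta - z1) ** n - (zeta - z2) ** n)
        print(f"{residual:.2e}", abs(zeta - c) <= r + 1e-12)
```

Each point makes $(z - z_1)^n - (z - z_2)^n$ vanish up to rounding and stays within distance r of c.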


In [Ma7], there are several other theorems on localization of the roots of 
the derivative. 


1.2.4 The Sendov-Ilieff conjecture 


In 1962, the Bulgarian mathematician B. Sendov made the following conjecture,
often ascribed to another Bulgarian mathematician, L. Ilieff:

“Let P(z) be a polynomial ($\deg P \ge 2$) all of whose roots lie in the disk
$|z| \le 1$. If $z_0$ is one of the roots of P(z), then the disk $|z - z_0| \le 1$
contains at least one root of P’(z)”.
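
For cubics the conjecture can be probed numerically, since the roots of P’ are given by the quadratic formula. The following sketch is only an experiment on sample points, not a proof; the function name and the tolerance are assumptions made for the illustration.

```python
import cmath

def sendov_holds_for_cubic(z0, z1, z2, tol=1e-9):
    """Check, for P = (z - z0)(z - z1)(z - z2), that each root of P has a
    root of P' within distance 1 (up to the rounding tolerance tol)."""
    s1 = z0 + z1 + z2
    s2 = z0 * z1 + z0 * z2 + z1 * z2
    root = cmath.sqrt(4 * s1 * s1 - 12 * s2)       # P' = 3z^2 - 2*s1*z + s2
    crit = [(2 * s1 + root) / 6, (2 * s1 - root) / 6]
    return all(min(abs(z - w) for w in crit) <= 1 + tol for z in (z0, z1, z2))

if __name__ == "__main__":
    # Sample points of the closed unit disk, given in polar form.
    pts = [cmath.rect(r, t) for r in (0.3, 1.0) for t in (0.0, 1.0, 2.5, 4.0)]
    print(all(sendov_holds_for_cubic(a, b, c)
              for a in pts for b in pts for c in pts))
```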


This conjecture is proved for all polynomials of degree < 5 and several 
particular polynomials (see, e.g., [Sc4]). 
We confine ourselves to the proof of the conjecture for polynomials of the
form
$$P(z) = (z - z_0)^{n_0}(z - z_1)^{n_1}(z - z_2)^{n_2}.$$

This proof is given in [Co2].
The case when $n = n_0 + n_1 + n_2 \ge 4$ is much the simplest. In this case we
have to prove that if $|z_i| \le 1$ for $i = 0, 1, 2$, then the polynomial

$$P'(z) = n(z - z_0)^{n_0-1}(z - z_1)^{n_1-1}(z - z_2)^{n_2-1}(z - w_1)(z - w_2) \tag{1}$$

has a root lying in the disk $|z - z_0| \le 1$. If $n_0 > 1$, then $z_0$ is such a root.
We assume therefore that $n_0 = 1$. Let us express P(z) in the form
$P(z) = (z - z_0)Q(z)$. It is clear that

$$P'(z_0) = Q(z_0) = (z_0 - z_1)^{n_1}(z_0 - z_2)^{n_2}. \tag{2}$$

It follows from (1) and (2) that

$$n(z_0 - w_1)(z_0 - w_2) = (z_0 - z_1)(z_0 - z_2). \tag{3}$$

Taking into account that $|z_0 - z_1| \le |z_0| + |z_1| \le 2$ and $|z_0 - z_2| \le 2$,
we obtain

$$|z_0 - w_1|\cdot|z_0 - w_2| \le \frac{4}{n} \le 1,$$

and hence either $|z_0 - w_1| \le 1$ or $|z_0 - w_2| \le 1$.

It remains to consider the case when $n_0 = n_1 = n_2 = 1$. For this we need
the following auxiliary statement which we will formulate more generally than
is needed for this proof.


Lemma. Let P(z) be a polynomial of degree n, where $n \ge 2$. If
$$|P''(z_0)| \ge (n - 1)\,|P'(z_0)|,$$
then at least one of the roots of P’ lies inside the disk $|z - z_0| \le 1$.

Proof. Let $w_1, w_2, \dots, w_{n-1}$ be the roots of P’. We may assume that the
highest coefficient of P is equal to 1. In this case
$P'(z) = n\prod_{j=1}^{n-1}(z - w_j)$. If $P'(z) \ne 0$ we may take the logarithm of
both sides and differentiate. This gives

$$\frac{P''(z)}{P'(z)} = \sum_{j=1}^{n-1}\frac{1}{z - w_j}.$$

By the hypothesis $z_0$ is a simple root of P, i.e., $P'(z_0) \ne 0$. Suppose that
$|z_0 - w_j| > 1$ for $j = 1,\dots,n - 1$. Then the inequality
$|P''(z_0)| \ge (n - 1)|P'(z_0)|$ implies that

$$n - 1 \le \left|\frac{P''(z_0)}{P'(z_0)}\right| \le \sum_{j=1}^{n-1}\frac{1}{|z_0 - w_j|} < n - 1,$$

and we have a contradiction.


Now let us consider directly the polynomial

$$P(z) = (z - z_0)(z - z_1)(z - z_2) = (z - z_0)Q(z).$$

Clearly

$$\frac{P''(z_0)}{P'(z_0)} = 2\,\frac{Q'(z_0)}{Q(z_0)} = 2\left(\frac{1}{z_0 - z_1} + \frac{1}{z_0 - z_2}\right) = \frac{2(2z_0 - z_1 - z_2)}{(z_0 - z_1)(z_0 - z_2)}.$$

Now consider the triangle ABC with vertices $A = z_0$, $B = z_1$, $C = z_2$.
Obviously $|z_0 - z_1| = c$, $|z_0 - z_2| = b$ and $|2z_0 - z_1 - z_2| = 2m_a$, where
$m_a$ is the length of the median drawn from A. By Lemma the Sendov-Ilieff
conjecture holds if $4m_a \ge 2bc$.

By the hypothesis, $b \le 2$ and $c \le 2$, and hence $2m_a \ge bc$ holds both for
$m_a \ge b$ and for $m_a \ge c$. It remains to consider the case when $m_a < b$ and
$m_a < c$.

Relation (3) shows that the Sendov-Ilieff conjecture holds if $bc \le 3$. Therefore
we may assume that $bc > 3$. In this case

$$b^2 + c^2 = (b - c)^2 + 2bc > 6,$$

and therefore $b^2 + c^2 - a^2 > 6 - 4 > 0$, i.e., $\angle A < 90^\circ$. The inequalities
$b > m_a$ and $c > m_a$ imply that $\angle C < 90^\circ$ and $\angle B < 90^\circ$, and
hence the triangle ABC is acute.


Let R be the radius of its circumscribed circle, $h_a$ the length of the altitude
from A. Then $\frac{h_a}{c} = \sin B = \frac{b}{2R}$, i.e., $bc = 2Rh_a \le 2Rm_a$. To
obtain the inequality desired, $bc \le 2m_a$, it remains therefore to prove that
$R \le 1$. The acute triangle ABC lies inside the unit circle $|z| = 1$. If the
circumscribed circle S of the triangle ABC lies inside the unit circle, the
inequality $R \le 1$ is obvious. Let now S and the unit circle have a common chord.
Since ABC is acute, this chord subtends an acute angle $\varphi$ whose vertex
coincides with one of the vertices of the triangle ABC. The same chord subtends
the angles $\psi$ and $180^\circ - \psi$, where $\psi \le 90^\circ$, whose vertices lie on
the unit circle. Moreover, $\psi \le \varphi$. The inequalities
$\psi \le \varphi < 90^\circ \le 180^\circ - \psi$ imply that $R \le 1$.


1.2.5 Polynomials whose roots coincide with the roots of their 
derivatives 


In the paper [Ya] it was stated that if P and Q are monic polynomials (i.e., 
their highest coefficients are equal to 1) and the sets of roots of P and Q 
coincide, and the sets of roots of the polynomials P’ and Q’ also coincide, 
then $P^m = Q^n$ for certain positive integers m and n. Later certain gaps
were discovered in the proof of this statement and soon a counterexample 
was constructed in [Ro2]. The construction of this counterexample is rather 
complicated. We advise the interested reader to turn directly to [Ro2]. 

Concerning properties of polynomials whose roots coincide with the roots 
of the derivatives see also [Dol]. 


1.3 The resultant and the discriminant 


1.3.1 The resultant 

Consider polynomials $f(x) = \sum_{i=0}^{n}a_ix^{n-i}$ and
$g(x) = \sum_{i=0}^{m}b_ix^{m-i}$, where $a_0 \ne 0$ and $b_0 \ne 0$. Over an
algebraically closed field, f and g have a common divisor if and only if they have
a common root. If the field is not algebraically closed, then the common divisor
could be a polynomial without roots.

The existence of a common divisor of f and g is equivalent, as one can
show, to the existence of polynomials p and q such that $fq = gp$, where
$\deg p \le n - 1$ and $\deg q \le m - 1$. Indeed, let $f = hp$ and $g = hq$. Then
$fq = hpq = gp$. Suppose now that $fq = gp$, where $\deg q \le \deg g - 1$. If f and
g do not have a common divisor, then g divides q: a contradiction.

Let $q = u_0x^{m-1} + \cdots + u_{m-1}$ and $p = v_0x^{n-1} + \cdots + v_{n-1}$. The
equality $fq = gp$ can be expressed as a system of equations:

$$\begin{aligned}
a_0u_0 &= b_0v_0,\\
a_1u_0 + a_0u_1 &= b_1v_0 + b_0v_1,\\
a_2u_0 + a_1u_1 + a_0u_2 &= b_2v_0 + b_1v_1 + b_0v_2,\\
&\ \,\vdots
\end{aligned}$$

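The polynomials f and g have a common root if and only if this system has
a nonzero solution $(u_0, u_1, \dots, v_0, v_1, \dots)$. If, for example, $m = 3$ and
$n = 2$, the determinant of this system of equations is of the form

$$\begin{vmatrix}
a_0 & 0 & 0 & -b_0 & 0\\
a_1 & a_0 & 0 & -b_1 & -b_0\\
a_2 & a_1 & a_0 & -b_2 & -b_1\\
0 & a_2 & a_1 & -b_3 & -b_2\\
0 & 0 & a_2 & 0 & -b_3
\end{vmatrix}
= \pm
\begin{vmatrix}
a_0 & a_1 & a_2 & 0 & 0\\
0 & a_0 & a_1 & a_2 & 0\\
0 & 0 & a_0 & a_1 & a_2\\
b_0 & b_1 & b_2 & b_3 & 0\\
0 & b_0 & b_1 & b_2 & b_3
\end{vmatrix}
= \pm\det S(f, g).$$

The matrix

$$S(f, g) = \begin{pmatrix}
a_0 & a_1 & a_2 & 0 & 0\\
0 & a_0 & a_1 & a_2 & 0\\
0 & 0 & a_0 & a_1 & a_2\\
b_0 & b_1 & b_2 & b_3 & 0\\
0 & b_0 & b_1 & b_2 & b_3
\end{pmatrix}$$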

is called the Sylvester matrix of the polynomials f and g. The determinant
of S(f, g) is called the resultant of f and g and is denoted by R(f, g). Clearly,
R(f, g) is a homogeneous polynomial of degree m with respect to indeterminates
$a_i$ and of degree n with respect to indeterminates $b_j$. The polynomials
f and g have a common divisor if and only if the determinant of the system
considered vanishes, i.e., R(f, g) = 0.
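
The definition translates directly into a short computation. The sketch below builds the Sylvester matrix in the row layout shown above and evaluates its determinant exactly over the rationals; the function names and the coefficient-list convention (highest degree first, as in $a_0x^n + \cdots + a_n$) are choices made for this illustration.

```python
from fractions import Fraction

def sylvester(f, g):
    """Sylvester matrix S(f, g); f and g are coefficient lists, highest degree first."""
    n, m = len(f) - 1, len(g) - 1
    size = n + m
    rows = [[0] * i + list(f) + [0] * (size - n - 1 - i) for i in range(m)]
    rows += [[0] * i + list(g) + [0] * (size - m - 1 - i) for i in range(n)]
    return rows

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    result = Fraction(1)
    for col in range(len(a)):
        pivot = next((r for r in range(col, len(a)) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:                      # a row swap flips the sign
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, len(a)):
            factor = a[r][col] / a[col][col]
            for c in range(col, len(a)):
                a[r][c] -= factor * a[col][c]
    return result

def resultant(f, g):
    """R(f, g) = det S(f, g)."""
    return det(sylvester(f, g))

if __name__ == "__main__":
    # (x - 1)(x - 2) and (x - 2)(x - 3) share the root 2.
    print(resultant([1, -3, 2], [1, -5, 6]))
    # x - 5 and x - 3 have no common root.
    print(resultant([1, -5], [1, -3]))
```

The first call prints 0, reflecting the common root; the second prints a nonzero value.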

The resultant has many different applications. For example, given polynomial
relations P(x, z) = 0 and Q(y, z) = 0 we can, with the help of the
resultant, obtain a polynomial relation of the form R(x, y) = 0, i.e., eliminate
z. Indeed, consider the given polynomials P(x, z) and Q(y, z) as polynomials
in z regarding x and y as constants. Then the vanishing of the resultant of
these polynomials is exactly the relation desired R(x, y) = 0.

The resultant also allows one to reduce the solution of the system of algebraic
equations to the search for roots of polynomials. Indeed, let $P(x_0, y_0) = 0$
and $Q(x_0, y_0) = 0$. Consider P(x, y) and Q(x, y) as polynomials in y. For
$x = x_0$, they have a common root $y_0$. Therefore their resultant R(x) vanishes
at $x = x_0$.

Theorem 1.3.1. Let $x_i$ be the roots of f, and $y_j$ the roots of g. Then

$$R(f, g) = a_0^m b_0^n\prod_{i,j}(x_i - y_j) = a_0^m\prod_{i=1}^{n}g(x_i) = (-1)^{mn}\,b_0^n\prod_{j=1}^{m}f(y_j).$$


Proof. Since $f(x) = a_0(x - x_1)\cdots(x - x_n)$, it follows that

$$a_k = \pm a_0\sigma_k(x_1,\dots,x_n),$$

where $\sigma_k$ is an elementary symmetric function. Similarly,

$$b_k = \pm b_0\sigma_k(y_1,\dots,y_m).$$

The resultant is a homogeneous polynomial of degree m with respect to
indeterminates $a_i$ and of degree n with respect to the $b_j$. Hence

$$R(f, g) = a_0^m b_0^n P(x_1,\dots,x_n,y_1,\dots,y_m),$$

where P is a symmetric polynomial in $x_1,\dots,x_n$ and $y_1,\dots,y_m$ vanishing at
$x_i = y_j$. The formula

$$x_i^k = (x_i - y_j)x_i^{k-1} + y_jx_i^{k-1}$$

shows that

$$P(x_1,\dots,y_m) = (x_i - y_j)Q(x_1,\dots,y_m) + U(x_1,\dots,\widehat{x_i},\dots,y_m).$$

Substituting $x_i = y_j$ into this formula we see that U is the zero polynomial.
Similar arguments show that $a_0^m b_0^n P$ is divisible by
$S = a_0^m b_0^n\prod_{i,j}(x_i - y_j)$.

Since $g(x) = b_0\prod_{j=1}^{m}(x - y_j)$, we have
$\prod_{i=1}^{n}g(x_i) = b_0^n\prod_{i,j}(x_i - y_j)$, and therefore

$$S = a_0^m\prod_{i=1}^{n}g(x_i) = a_0^m\prod_{i=1}^{n}\left(b_0x_i^m + b_1x_i^{m-1} + \cdots + b_m\right)$$

is a homogeneous polynomial of degree n in indeterminates $b_0,\dots,b_m$. For
indeterminates $a_0,\dots,a_n$, the arguments are similar. It is also clear that the
symmetric polynomial $a_0^m\prod_{i=1}^{n}\left(b_0x_i^m + b_1x_i^{m-1} + \cdots + b_m\right)$
is a polynomial in $a_0,\dots,a_n,b_0,\dots,b_m$. Hence
$R(f, g) = R(a_0,\dots,b_m) = \lambda S$, where $\lambda$ is a number which does not
depend on the $a_i$ and $b_j$.

On the other hand, the coefficient of $\prod x_i^m$ in $a_0^m b_0^n P(x_1,\dots,y_m)$
and S is equal to $a_0^m b_0^n$, hence, $\lambda = 1$.
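
The three product expressions for R(f, g) can be compared on polynomials with prescribed roots. In the sketch below the helper names are hypothetical; note that the form built from the values $f(y_j)$ carries the sign $(-1)^{mn}$, in agreement with Corollary 1.

```python
from math import prod

def from_roots(lead, roots):
    """Coefficients (highest degree first) of lead * prod(x - r)."""
    coeffs = [lead]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        coeffs = [c - r * p for c, p in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def evaluate(coeffs, x):
    """Horner evaluation."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def three_forms(a0, xs, b0, ys):
    """The three expressions of Theorem 1.3.1 for f = a0*prod(x - x_i), g = b0*prod(x - y_j)."""
    n, m = len(xs), len(ys)
    f, g = from_roots(a0, xs), from_roots(b0, ys)
    first = a0 ** m * b0 ** n * prod(x - y for x in xs for y in ys)
    second = a0 ** m * prod(evaluate(g, x) for x in xs)
    third = (-1) ** (m * n) * b0 ** n * prod(evaluate(f, y) for y in ys)
    return first, second, third

if __name__ == "__main__":
    print(three_forms(2, [1, 2], 3, [4, -1, 5]))
    print(three_forms(1, [0], 1, [1]))   # m*n odd: the sign matters here
```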


Corollary 1. $R(g, f) = (-1)^{\deg f\,\deg g}R(f, g)$.

Corollary 2. If $f = gq + r$, then
$$R(f, g) = (-1)^{(\deg f - \deg r)\deg g}\,b_0^{\deg f - \deg r}R(r, g),$$

where $b_0$ is the leading coefficient of g.

Proof. Let $y_j$ be the roots of g. Then $f(y_j) = r(y_j)$. It remains to use that
$R(f, g) = (-1)^{\deg f\,\deg g}\,b_0^{\deg f}\prod f(y_j)$ and
$R(r, g) = (-1)^{\deg r\,\deg g}\,b_0^{\deg r}\prod r(y_j)$.


Corollary 3. $R(f, gh) = R(f, g)R(f, h)$.


Proof. Let $x_i$ be the roots of f and $a_0$ its leading coefficient. Then

$$R(f, gh) = a_0^{\deg g + \deg h}\prod g(x_i)h(x_i),$$
$$R(f, g) = a_0^{\deg g}\prod g(x_i),$$
$$R(f, h) = a_0^{\deg h}\prod h(x_i).$$


Theorem 1.3.2. Let $f(x) = \sum_{i=0}^{n}a_ix^{n-i}$ and
$g(x) = \sum_{i=0}^{m}b_ix^{m-i}$. Then there exist polynomials $\varphi$ and $\psi$
with integer coefficients in indeterminates $a_0,\dots,a_n$, $b_0,\dots,b_m$ and x for
which the identity

$$\varphi(x, a, b)f(x) + \psi(x, a, b)g(x) = R(f, g)$$

holds.


Proof. Let $c_0,\dots,c_{n+m-1}$ be the columns of the Sylvester matrix S(f, g)
and $y_k = x^{n+m-k-1}$. Then

$$y_0c_0 + \cdots + y_{n+m-1}c_{n+m-1} = c,$$

where c is the column vector

$$\left(x^{m-1}f(x), \dots, f(x), x^{n-1}g(x), \dots, g(x)\right)^T.$$

Consider $y_0c_0 + \cdots + y_{n+m-1}c_{n+m-1} = c$ as a system of linear equations
for $y_0,\dots,y_{n+m-1}$ and make use of Cramer’s rule in order to find
$y_{n+m-1}$. We obtain

$$y_{n+m-1}\det(c_0,\dots,c_{n+m-1}) = \det(c_0,\dots,c_{n+m-2}, c). \tag{1}$$

It remains to notice that $y_{n+m-1} = 1$, $\det(c_0,\dots,c_{n+m-1}) = R(f, g)$ and
the determinant on the right-hand side of (1) can be represented in the form
desired, i.e., as $\varphi(x, a, b)f(x) + \psi(x, a, b)g(x)$.


1.3.2 The discriminant 


Let $x_1,\dots,x_n$ be the roots of the polynomial $f(x) = a_0x^n + \cdots + a_n$, where
$a_0 \ne 0$. The quantity

$$D(f) = a_0^{2n-2}\prod_{i<j}(x_i - x_j)^2$$

is called the discriminant of f.

Theorem 1.3.3. $R(f, f') = \pm a_0D(f)$.

Proof. By Theorem 1.3.1 we have $R(f, f') = a_0^{n-1}\prod_i f'(x_i)$. It is easy to
verify that $f'(x_i) = a_0\prod_{j\ne i}(x_i - x_j)$. Therefore

$$R(f, f') = a_0^{2n-1}\prod_{j\ne i}(x_i - x_j) = \pm a_0^{2n-1}\prod_{i<j}(x_i - x_j)^2.$$


Remark. It is not difficult to show that

$$R(f, f') = R(f', f) = (-1)^{n(n-1)/2}a_0D(f).$$


Corollary. The discriminant of f is a polynomial in $a_0,\dots,a_n$ with
integer coefficients.
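
For a quadratic the relation between the resultant and the discriminant can be verified by hand. The following worked computation is added as an illustration; it uses the Sylvester matrix of $f$ and $f'$ with $n = 2$, $m = 1$, and writes the coefficients as $a, b, c$ instead of $a_0, a_1, a_2$.

```latex
% f(x) = a x^2 + b x + c,  f'(x) = 2 a x + b
\[
  R(f, f') =
  \begin{vmatrix}
    a  & b  & c \\
    2a & b  & 0 \\
    0  & 2a & b
  \end{vmatrix}
  = a\,b^2 - b\,(2ab) + c\,(4a^2)
  = -a\,(b^2 - 4ac).
\]
% With n = 2 the Remark gives R(f, f') = (-1)^{n(n-1)/2} a D(f) = -a D(f), hence
\[
  D(f) = b^2 - 4ac .
\]
```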


Theorem 1.3.4. Let f, g, and h be monic polynomials. Then
$$D(fg) = D(f)D(g)R^2(f, g),$$
$$D(fgh) = D(f)D(g)D(h)R^2(f, g)R^2(g, h)R^2(h, f).$$


Proof. Let $x_1,\dots,x_n$ be the roots of f, and $y_1,\dots,y_m$ the roots of g.
Then

$$D(fg) = \prod_{i<j}(x_i - x_j)^2\prod_{i<j}(y_i - y_j)^2\prod_{i,j}(x_i - y_j)^2 = D(f)D(g)R^2(f, g).$$

The second formula is proved similarly.


Theorem 1.3.5. Let f be a real polynomial of degree n without real roots.
Then $\operatorname{sgn}D(f) = (-1)^{n/2}$.


Proof. Making use of the factorization
$$f(x) = a_0(x - x_1)\cdots(x - x_n)$$
it is easy to verify that

$$D\bigl((x - a)f(x)\bigr) = D(f(x))\,(f(a))^2.$$

Let $\alpha$ and $\bar\alpha$ be a pair of conjugate roots of f, i.e.,
$f(x) = (x - \alpha)(x - \bar\alpha)g(x)$. Then
$$D(f(x)) = D(g(x))\,(\alpha - \bar\alpha)^2\,\bigl(g(\alpha)g(\bar\alpha)\bigr)^2.$$
Clearly, $\operatorname{sgn}(\alpha - \bar\alpha)^2 = -1$ and
$\bigl(g(\alpha)g(\bar\alpha)\bigr)^2 = |g(\alpha)|^4 > 0$. Therefore
$\operatorname{sgn}D(f) = -\operatorname{sgn}D(g)$. Now it is easy to obtain the statement
required by induction on n.


Theorem 1.3.6. Let $f(x) = x^n + a_1x^{n-1} + \cdots + a_n$ be a polynomial with
integer coefficients. Then its discriminant D(f) is equal to either 4k or 4k + 1,
where k is an integer.

Proof. Let $x_1,\dots,x_n$ be the roots of f. Then

$$D(f) = \delta^2(f), \quad\text{where } \delta(f) = \prod_{i<j}(x_i - x_j).$$

Consider an auxiliary polynomial $\delta_1(f) = \prod_{i<j}(x_i + x_j)$. Clearly,
$\delta_1(f)$ is a symmetric function of the roots of f, and hence $\delta_1(f)$ is an
integer. Moreover,

$$\delta_1^2(f) - \delta^2(f) = \prod_{i<j}\left((x_i - x_j)^2 + 4x_ix_j\right) - \prod_{i<j}(x_i - x_j)^2 = 4U(x_1,\dots,x_n),$$

where U is a symmetric polynomial in $x_1,\dots,x_n$ with integer coefficients.
Therefore $D(f) = \delta_1^2(f) + 4k_1$, where $k_1$ is an integer. It is also clear that
$\delta_1^2(f) = 4k_2$ or $4k_2 + 1$.
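
Theorem 1.3.6 lends itself to a brute-force check over small monic integer polynomials. The sketch below computes D(f) from the Sylvester determinant of f and f’ via $R(f, f') = (-1)^{n(n-1)/2}a_0D(f)$; the function names and the range of coefficients are arbitrary choices made for the illustration.

```python
from fractions import Fraction
from itertools import product

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    result = Fraction(1)
    for col in range(len(a)):
        pivot = next((r for r in range(col, len(a)) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, len(a)):
            factor = a[r][col] / a[col][col]
            for c in range(col, len(a)):
                a[r][c] -= factor * a[col][c]
    return result

def discriminant(f):
    """D(f) for a coefficient list f (highest degree first, degree >= 2)."""
    n = len(f) - 1
    df = [c * (n - i) for i, c in enumerate(f[:-1])]       # coefficients of f'
    size = 2 * n - 1
    rows = [[0] * i + list(f) + [0] * (size - n - 1 - i) for i in range(n - 1)]
    rows += [[0] * i + df + [0] * (size - n - i) for i in range(n)]
    sign = -1 if (n * (n - 1) // 2) % 2 else 1
    return sign * det(rows) / f[0]

if __name__ == "__main__":
    residues = {discriminant([1, *c]) % 4 for c in product(range(-2, 3), repeat=3)}
    print(sorted(residues))    # only residues 0 and 1 are expected
```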


1.3.3 Computing certain resultants and discriminants 


In this section we give several examples on how to compute resultants and 
discriminants. 


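Example 1.3.7. $D(x^n + a) = (-1)^{n(n-1)/2}\,n^na^{n-1}$.


Proof. Let us make use of the fact that
$$D(f) = (-1)^{n(n-1)/2}R(f, f') = (-1)^{n(n-1)/2}\prod_i f'(x_i),$$

where $x_1,\dots,x_n$ are the roots of f. In our case $f'(x) = nx^{n-1}$ and
$\prod x_i = (-1)^na$, and therefore
$\prod x_i^{n-1} = (-1)^{n(n-1)}a^{n-1} = a^{n-1}$.


Example 1.3.8. Let $\varphi(x) = x^{n-1} + \cdots + 1$. Then
$D(\varphi) = (-1)^{(n-1)(n-2)/2}\,n^{n-2}$.

Proof. Since $(x - 1)\varphi(x) = x^n - 1$, it follows that

$$D(\varphi)\,(\varphi(1))^2 = D\bigl((x - 1)\varphi(x)\bigr) = D(x^n - 1) = (-1)^{(n-1)(n-2)/2}\,n^n.$$

It remains to observe that $\varphi(1) = n$.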


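Example 1.3.9. Let $f_n(x) = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}$. Then

$$D(n!\,f_n) = (-1)^{n(n-1)/2}\,(n!)^n.$$


Proof. The polynomial $g_n = n!\,f_n$ is monic, and hence

$$D(g_n) = (-1)^{n(n-1)/2}R(g_n, g_n') = (-1)^{n(n-1)/2}\prod_{i=1}^{n}g_n'(\alpha_i),$$

where $\alpha_1,\dots,\alpha_n$ are the roots of $f_n$. Clearly,

$$g_n'(\alpha_i) = n!\,f_n'(\alpha_i) = n!\,f_{n-1}(\alpha_i) = n!\left(f_n(\alpha_i) - \frac{\alpha_i^n}{n!}\right) = -\alpha_i^n.$$

Therefore

$$D(g_n) = (-1)^{n(n-1)/2}\,(-1)^n\Bigl(\prod_{i=1}^{n}\alpha_i\Bigr)^n.$$

It remains to observe that $\prod\alpha_i = (-1)^ng_n(0) = (-1)^n\,n!$.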


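Example 1.3.10. Let $d = (r, s)$, $r_1 = \frac{r}{d}$ and $s_1 = \frac{s}{d}$. Then

$$R(x^r - a, x^s - b) = (-1)^{r+d}\left(a^{s_1} - b^{r_1}\right)^d.$$

(The sign is written here for the resultant defined as $\det S(f, g)$, as in
Theorem 1.3.1; for the resultant normalized as $b_0^n\prod f(y_j)$ the sign
is $(-1)^s$.)


Proof. The relation $R(g, f) = (-1)^{\deg f\,\deg g}R(f, g)$ shows that if the desired
statement holds for a pair (r, s), then it also holds for the pair (s, r).
Indeed, the number $rs + r + s + d$ is always even. We may therefore assume
that $r \ge s$.

For s = 0, the statement is obvious. If s > 0, then having divided $x^r - a$
by $x^s - b$ we get the residue $bx^{r-s} - a$. Hence, keeping track of the sign
$(-1)^{s\cdot s} = (-1)^s$ that arises when Theorem 1.3.1 is applied as in Corollary 2,

$$R(x^r - a, x^s - b) = (-1)^sR(bx^{r-s} - a, x^s - b)
= (-1)^sR(b, x^s - b)\,R\!\left(x^{r-s} - \tfrac{a}{b}, x^s - b\right)
= (-1)^sb^s\,R\!\left(x^{r-s} - \tfrac{a}{b}, x^s - b\right).$$

It is easy to see that if
$R\!\left(x^{r-s} - \tfrac{a}{b}, x^s - b\right) = (-1)^{r-s+d}\left(\left(\tfrac{a}{b}\right)^{s_1} - b^{r_1-s_1}\right)^d$,
then

$$R(x^r - a, x^s - b) = (-1)^{r+d}\left(a^{s_1} - b^{r_1}\right)^d.$$

It remains to use induction on r + s.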


Example 1.3.11. Let $n > k > 0$, $d = (n, k)$, $n_1 = \frac{n}{d}$ and
$k_1 = \frac{k}{d}$. Then

$$D(x^n + ax^k + b) = (-1)^{n(n-1)/2}\,b^{k-1}\left(n^{n_1}b^{n_1-k_1} + (-1)^{n_1+1}(n - k)^{n_1-k_1}k^{k_1}a^{n_1}\right)^d.$$

Proof. [Sw] The formula $D(f) = (-1)^{n(n-1)/2}R(f, f')$ gives

$$D(x^n + ax^k + b) = (-1)^{n(n-1)/2}R(x^n + ax^k + b,\ nx^{n-1} + kax^{k-1})
= (-1)^{n(n-1)/2}\,n^n\,R(x^n + ax^k + b,\ x^{n-1} + n^{-1}kax^{k-1}).$$

Using the fact that, for a monic polynomial f of degree n,

$$R(f, x^mg) = R(f, x^m)R(f, g) = \bigl((-1)^nf(0)\bigr)^mR(f, g),$$

we obtain

$$D(x^n + ax^k + b) = (-1)^{n(n-1)/2 + n(k-1)}\,n^nb^{k-1}\,R(x^n + ax^k + b,\ x^{n-k} + n^{-1}ka).$$

The residue after the division of $x^n + ax^k + b$ by $x^{n-k} + n^{-1}ka$ is equal to
$a(1 - n^{-1}k)x^k + b$, and hence

$$R(x^n + ax^k + b,\ x^{n-k} + n^{-1}ka) = (-1)^{n-k}R\bigl(a(1 - n^{-1}k)x^k + b,\ x^{n-k} + n^{-1}ka\bigr).$$

The resultant of a pair of two binomials is computed in Example 1.3.10.
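
The closed formula of Example 1.3.11 can be compared with a direct computation of the discriminant from the Sylvester determinant of f and f’. The sketch below does this for small degrees and integer coefficients; the function names are hypothetical and the ranges are arbitrary.

```python
from fractions import Fraction
from math import gcd

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    result = Fraction(1)
    for col in range(len(a)):
        pivot = next((r for r in range(col, len(a)) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, len(a)):
            factor = a[r][col] / a[col][col]
            for c in range(col, len(a)):
                a[r][c] -= factor * a[col][c]
    return result

def discriminant(f):
    """D(f) = (-1)^(n(n-1)/2) R(f, f') / a_0 for a coefficient list f."""
    n = len(f) - 1
    df = [c * (n - i) for i, c in enumerate(f[:-1])]
    size = 2 * n - 1
    rows = [[0] * i + list(f) + [0] * (size - n - 1 - i) for i in range(n - 1)]
    rows += [[0] * i + df + [0] * (size - n - i) for i in range(n)]
    sign = -1 if (n * (n - 1) // 2) % 2 else 1
    return sign * det(rows) / f[0]

def trinomial(n, k, a, b):
    """Coefficient list of x^n + a*x^k + b."""
    f = [0] * (n + 1)
    f[0], f[n - k], f[n] = 1, a, b
    return f

def closed_formula(n, k, a, b):
    """Right-hand side of the formula in Example 1.3.11."""
    d = gcd(n, k)
    n1, k1 = n // d, k // d
    inner = (n ** n1 * b ** (n1 - k1)
             + (-1) ** (n1 + 1) * (n - k) ** (n1 - k1) * k ** k1 * a ** n1)
    return (-1) ** (n * (n - 1) // 2) * b ** (k - 1) * inner ** d

if __name__ == "__main__":
    print(discriminant(trinomial(3, 1, 2, 5)), closed_formula(3, 1, 2, 5))
```

For $x^3 + 2x + 5$ both values equal $-4\cdot 2^3 - 27\cdot 5^2$, the familiar discriminant of a depressed cubic.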


1.4 Separation of roots 


Here we discuss various theorems which allow us to compute, or at least
estimate from above, the number of real roots of a polynomial on a given
segment (a, b). Formulations of such theorems often use the notion of the number
of sign changes in the sequence $a_0, a_1, \dots, a_n$, where $a_0a_n \ne 0$. This number
is determined as follows: all the zero terms of the sequence considered are
deleted and, for the remaining non-zero terms, one counts the number of
pairs of neighboring terms of different sign.


1.4.1 The Fourier—Budan theorem 


Theorem 1.4.1 (Fourier-Budan). Let N(x) be the number of sign changes
in the sequence $f(x), f'(x), \dots, f^{(n)}(x)$, where f is a polynomial of degree n.
Then the number of roots of f (multiplicities counted) between a and b, where
$f(a) \ne 0$, $f(b) \ne 0$ and $a < b$, does not exceed $N(a) - N(b)$. Moreover, the
number of roots can differ from $N(a) - N(b)$ by an even number only.


Proof. Let x be a point which moves along the segment [a, b] from a to b.
The number N(x) varies only if x passes through a root of the polynomial
$f^{(m)}$ for some $m < n$.

Consider first the case when x passes through a root $x_0$ of multiplicity
r of f(x). In a neighborhood of $x_0$, the polynomials
$f(x), f'(x), \dots, f^{(r)}(x)$ behave approximately as

$$(x - x_0)^rg(x_0),\quad (x - x_0)^{r-1}rg(x_0),\quad \dots,\quad r!\,g(x_0),$$

respectively. Therefore, for $x < x_0$, there are r sign changes in this sequence
and for $x > x_0$ there are no sign changes (assuming that x is sufficiently close
to $x_0$).

Now suppose that x passes through a root $x_0$ of multiplicity r of $f^{(m)}$; let
$x_0$ be not a root of $f^{(m-1)}$. (Of course, $x_0$ can be a root of f as well, as it can
be not a root of f.) We have to prove that under the passage through $x_0$ the
number of sign changes in the sequence
$f^{(m-1)}(x), f^{(m)}(x), \dots, f^{(m+r)}(x)$ changes by a non-negative even integer.
Indeed, in a vicinity of $x_0$ these polynomials behave approximately as

$$F(x_0),\quad (x - x_0)^rG(x_0),\quad (x - x_0)^{r-1}rG(x_0),\quad \dots,\quad r!\,G(x_0). \tag{1}$$

Excluding $F(x_0)$, we see that the remaining system has exactly r sign changes
for $x < x_0$ and no sign changes for $x > x_0$. Concerning the first two terms,
$F(x_0)$ and $(x - x_0)^rG(x_0)$, of the sequence (1) we see that if r is even, then
the number of sign changes is the same for $x < x_0$ and $x > x_0$ whereas if r is
odd, then the number of sign changes for $x < x_0$ is by 1 greater or less than
for $x > x_0$ (depending whether $F(x_0)$ and $G(x_0)$ have the same sign or the
opposite sign). Thus, for r even, the difference in the number of sign changes
is equal to r and, for r odd, the difference of the number of sign changes is
equal to $r \pm 1$. In both these cases this difference is even and non-negative.
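
The bound $N(a) - N(b)$ is straightforward to compute. The following sketch is an illustration with hypothetical function names; polynomials are coefficient lists with the highest degree first, and zero terms are dropped before counting sign changes, as described at the beginning of this section.

```python
def derivatives(f):
    """The sequence f, f', ..., f^(n) as coefficient lists (highest degree first)."""
    seq = [list(f)]
    while len(seq[-1]) > 1:
        p = seq[-1]
        n = len(p) - 1
        seq.append([c * (n - i) for i, c in enumerate(p[:-1])])
    return seq

def evaluate(p, x):
    """Horner evaluation."""
    value = 0
    for c in p:
        value = value * x + c
    return value

def sign_changes(values):
    """Number of sign changes after deleting the zero terms."""
    signs = [v > 0 for v in values if v != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))

def budan_bound(f, a, b):
    """N(a) - N(b) for the sequence f, f', ..., f^(n)."""
    seq = derivatives(f)
    count = lambda x: sign_changes([evaluate(p, x) for p in seq])
    return count(a) - count(b)

if __name__ == "__main__":
    print(budan_bound([1, -6, 11, -6], 0, 4))   # (x-1)(x-2)(x-3): bound is sharp
    print(budan_bound([1, 0, 1], -1, 1))        # x^2 + 1: bound 2, no real roots
```

In the second example the bound exceeds the true number of roots by 2, an even number, as the theorem allows.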


Corollary 1. (The Descartes Rule) The number of positive roots of the
polynomial $f(x) = a_0x^n + a_1x^{n-1} + \cdots + a_n$ does not exceed the number of
sign changes in the sequence $a_0, a_1, \dots, a_n$.


Proof. Since $f^{(r)}(0) = r!\,a_{n-r}$, it follows that N(0) coincides with the
number of sign changes in the sequence of coefficients of f. It is also clear
that $N(+\infty) = 0$.


Remark. Jacobi showed that the Descartes Rule can be used also to estimate
the number of roots between $\alpha$ and $\beta$. To this end one should make
the change of variables $y = \frac{x - \alpha}{\beta - x}$, i.e.,
$x = \frac{\alpha + \beta y}{1 + y}$, and consider the polynomial

$$(1 + y)^nf\!\left(\frac{\alpha + \beta y}{1 + y}\right) = b_0y^n + b_1y^{n-1} + \cdots + b_n.$$

The Descartes Rule applied to this polynomial yields an estimate of the number
of roots between $\alpha$ and $\beta$. Indeed, y varies from 0 to $\infty$, as x varies
from $\alpha$ to $\beta$.


Corollary 2. (de Gua) If the polynomial lacks 2m consecutive terms
(i.e., the coefficients of these terms vanish), then this polynomial has no less
than 2m imaginary roots. If 2m + 1 consecutive terms are missing, then if
they are between terms of different signs, the polynomial has no less than 2m
imaginary roots, whereas if the missing terms are between terms of the same
sign the polynomial has no less than 2m + 2 imaginary roots.


In certain cases the comparison of the sign changes in two sequences allows
one to sharpen the estimate of the number of roots as compared with the
estimate given by the Fourier-Budan theorem. The first to formulate this
type of theorem was Newton but it was proved (by Sylvester) much later,
in 1871. Let us replace the sequence $f(x), f'(x), \dots, f^{(n)}(x)$ by the sequence
$f_0(x), f_1(x), \dots, f_n(x)$, where

$$f_i(x) = \frac{(n - i)!}{n!}\,f^{(i)}(x), \tag{2}$$

and consider one more sequence $F_0(x), F_1(x), \dots, F_n(x)$, where
$F_0(x) = f_0^2(x)$, $F_n(x) = f_n^2(x)$ and

$$F_i(x) = f_i^2(x) - f_{i-1}(x)f_{i+1}(x), \quad i = 1, \dots, n - 1. \tag{3}$$


Convention 1.4.1 Let us take into account only the pairs $f_i(x), f_{i+1}(x)$ for
which $\operatorname{sgn}F_i(x) = \operatorname{sgn}F_{i+1}(x)$.


Let $N_+(x)$ be the number of pairs for which
$\operatorname{sgn}f_i(x) = \operatorname{sgn}f_{i+1}(x)$ and let $N_-(x)$ be the number of
pairs for which $\operatorname{sgn}f_i(x) = -\operatorname{sgn}f_{i+1}(x)$.


Theorem 1.4.2 (Newton-Sylvester). Let f be a polynomial of degree n
without multiple roots. Then the number of roots of f between a and b, where
$a < b$ and $f(a)f(b) \ne 0$, does not exceed either $N_+(b) - N_+(a)$ or
$N_-(a) - N_-(b)$.


Proof. First consider the case when f satisfies the following conditions: 


1) no two consecutive polynomials f; have common roots; 
2) no two consecutive polynomials F; have common roots; 
3) the roots of f; and F; are distinct from a and b. 


In this case formula (3) implies that $f_i$ and $F_i$ have no common roots.
It is easy to derive from (2) and (3) that

$$f_i' = (n - i)f_{i+1}, \tag{4}$$

$$f_{i+1}F_i' = (n - i - 1)\left(f_iF_{i+1} + f_{i+2}F_i\right). \tag{5}$$


Let x move from a to b. The numbers $N_\pm(x)$ only vary if x passes either
through a root of $f_i$ or through a root of $F_i$. Consider separately the following
three cases.

Case 1: the passage through a root $x_0$ of $f_0 = f$. If $f_0(x_0) = 0$, then

$$F_1(x_0) = f_1^2(x_0) - f_0(x_0)f_2(x_0) = f_1^2(x_0) > 0.$$

Therefore the passage through $x_0$ does not involve a change of sign in the
sequence $F_0(x), F_1(x)$ (both terms are positive near $x_0$). Formula (4) implies
that $\operatorname{sgn}f_0'(x) = \operatorname{sgn}f_1(x)$. Therefore, if $f_1(x_0) > 0$, then
$f_0(x_0 - \varepsilon) < 0$ and $f_0(x_0 + \varepsilon) > 0$, whereas if $f_1(x_0) < 0$,
then $f_0(x_0 - \varepsilon) > 0$ and $f_0(x_0 + \varepsilon) < 0$. In both cases

$$f_0(x_0 - \varepsilon)f_1(x_0 - \varepsilon) < 0 \quad\text{and}\quad f_0(x_0 + \varepsilon)f_1(x_0 + \varepsilon) > 0.$$

Thus, the passage through $x_0$ increases $N_+$ by 1 and decreases $N_-$ by 1. (We
only consider the contribution to $N_\pm$ of the pair $f_0, f_1$.)

Case 2: the passage through a root $x_0$ of the polynomial $f_i$, where $i \ge 1$.
In this case the change of signs occurs in the sequence $f_{i-1}, f_i, f_{i+1}$. The
possible variants of the signs of the polynomials considered at $x = x_0 \pm \varepsilon$
are considerably restricted by the following relations:

1) $\operatorname{sgn}f_i' = \operatorname{sgn}f_{i+1}$ due to (4);

2) $\operatorname{sgn}F_i(x_0) = \operatorname{sgn}\left(f_i^2(x_0) - f_{i-1}(x_0)f_{i+1}(x_0)\right) = -\operatorname{sgn}\left(f_{i-1}(x_0)f_{i+1}(x_0)\right)$ due to (3);

3) $\operatorname{sgn}F_{i\pm1}(x_0) = \operatorname{sgn}f_{i\pm1}^2(x_0) = 1$ due to (3).

If $F_i(x_0) < 0$ the sign changes occur in the pairs $F_{i-1}, F_i$ and $F_i, F_{i+1}$
but, by Convention 1.4.1 just before the theorem, we do not consider such pairs.
If $F_i(x_0) > 0$, then $f_{i-1}(x_0)f_{i+1}(x_0) < 0$. The signs of the polynomials
$f_{i-1}, f_i, f_{i+1}$ considered at $x = x_0 \pm \varepsilon$ are completely determined by
the sign of $f_{i+1}(x_0)$. For both values of the signs, the pairs
$f_{i-1}(x_0 - \varepsilon), f_i(x_0 - \varepsilon)$ and $f_i(x_0 - \varepsilon), f_{i+1}(x_0 - \varepsilon)$
contribute to $N_+$ and $N_-$, respectively, and then the pairs
$f_{i-1}(x_0 + \varepsilon), f_i(x_0 + \varepsilon)$ and $f_i(x_0 + \varepsilon), f_{i+1}(x_0 + \varepsilon)$
contribute the other way round to $N_-$ and $N_+$, respectively. Thus, their total
contribution to $N_+$ as well as to $N_-$ does not vary.

Case 3: passage through a root $x_0$ of $F_i$. In this case the signs of the
polynomials satisfy the following relations:

1) $f_{i-1}(x_0)f_{i+1}(x_0) = f_i^2(x_0) - F_i(x_0) = f_i^2(x_0) > 0$;

2) $\operatorname{sgn}f_i' = \operatorname{sgn}f_{i+1}$;

3) formula (5) implies that $\operatorname{sgn}F_i' = \operatorname{sgn}\left(f_if_{i+1}F_{i+1}\right)$.

An easy perusal of the possible scenarios shows that either both $N_+$
and $N_-$ do not vary, or $N_+$ increases by 2, or $N_-$ decreases by 2.

It remains to explain how to get rid of conditions 1)-3) imposed on f.
If some of these conditions are not satisfied, then after a small variation of 
the coefficients of f these conditions will be satisfied. But the roots of f are 
simple ones, and therefore the number of roots of f lying strictly inside the 
segment [a,b] does not vary under a small variation of the coefficients. 


Remark. For the polynomial f with multiple roots, one should make use 
of a slightly more subtle argument. Namely, one should consider not arbi- 
trary small variations but only those for which the real root of multiplicity r 
splits into r distinct real roots. To produce such a small variation, it is more 
convenient to modify the roots of the polynomial rather than its coefficients. 


1.4.2 Sturm’s Theorem 


Consider the polynomials $f(x)$ and $f_1(x) = f'(x)$. Let us seek the greatest
common divisor of $f$ and $f_1$ with the help of Euclid's algorithm:

$$\begin{aligned}
f &= q_1 f_1 - f_2,\\
f_1 &= q_2 f_2 - f_3,\\
&\;\;\vdots\\
f_{n-2} &= q_{n-1} f_{n-1} - f_n,\\
f_{n-1} &= q_n f_n.
\end{aligned}$$

The sequence $f, f_1, \dots, f_{n-1}, f_n$ is called the Sturm sequence of the polynomial $f$.


Theorem 1.4.3 (Sturm). Let $w(x)$ be the number of sign changes in the
sequence

$$f(x),\ f_1(x),\ \dots,\ f_n(x).$$

The number of the roots of $f$ (without taking multiplicities into account) confined
between $a$ and $b$, where $f(a) \ne 0$, $f(b) \ne 0$ and $a < b$, is equal to
$w(a) - w(b)$.
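The statement of the theorem can be tried out numerically. The following is a minimal sketch, assuming exact rational arithmetic and polynomials stored as coefficient lists from the highest degree down; the function names (`poly_rem`, `sturm_sequence`, `count_roots`) are illustrative and do not come from the text.

```python
from fractions import Fraction

def poly_eval(p, x):
    """Evaluate p (coefficients from the highest degree down) at x by Horner's rule."""
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc

def poly_rem(a, b):
    """Remainder of a divided by b; both are lists of Fractions, highest degree first."""
    a = a[:]
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)                      # the leading coefficient is now zero
    while a and a[0] == 0:            # strip remaining leading zeros
        a.pop(0)
    return a

def sturm_sequence(f):
    """f, f_1 = f', and then f_{k+1} = -(remainder of f_{k-1} divided by f_k)."""
    f = [Fraction(c) for c in f]
    n = len(f) - 1                    # assumed: n >= 1
    f1 = [c * (n - i) for i, c in enumerate(f[:-1])]
    seq = [f, f1]
    while True:
        r = poly_rem(seq[-2], seq[-1])
        if not r:
            break
        seq.append([-c for c in r])
    return seq

def sign_changes(seq, x):
    """w(x): number of sign changes in the sequence of values, zeros being skipped."""
    vals = [v for v in (poly_eval(p, Fraction(x)) for p in seq) if v != 0]
    return sum(1 for u, v in zip(vals, vals[1:]) if u * v < 0)

def count_roots(f, a, b):
    """Number of distinct real roots of f between a and b (f(a), f(b) nonzero)."""
    seq = sturm_sequence(f)
    return sign_changes(seq, a) - sign_changes(seq, b)
```

For instance, `count_roots([1, 0, -3, 1], -3, 3)` counts the real roots of $x^3 - 3x + 1$ on the segment $[-3, 3]$, and `count_roots([1, 0, -3, 2], -3, 3)` treats $(x-1)^2(x+2)$, where the double root is counted once.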


Proof. First, consider the case when the roots of $f$ are simple (i.e., the
polynomials $f$ and $f'$ have no common roots). In this case $f_n$ is a nonzero
constant.

Let us verify first of all that when we pass through one of the roots of the
polynomials $f_1, \dots, f_{n-1}$ the number of sign changes does not vary. In the
case considered, the neighboring polynomials have no common roots, i.e., if
$f_r(\alpha) = 0$, then $f_{r \pm 1}(\alpha) \ne 0$. Moreover, the equality $f_{r-1} = q_r f_r - f_{r+1}$
implies that $f_{r-1}(\alpha) = -f_{r+1}(\alpha)$. But in this case the number of sign changes
in the sequence $f_{r-1}(\alpha)$, $\varepsilon$, $f_{r+1}(\alpha)$ is equal to 1 both for $\varepsilon > 0$ and for $\varepsilon < 0$.

Let us move from $a$ to $b$. If we pass through a root $x_0$ of $f$, then first the
numbers $f(x)$ and $f'(x)$ are of different signs and then they are of the same
sign. Therefore the number of sign changes in the Sturm sequence diminishes
by 1. All the other sign changes, as we have already shown, are preserved
during the passage through $x_0$.

Now consider the case when $x_0$ is a root of multiplicity $m$ of $f$. In this case
$f$ and $f_1$ have a common divisor $(x - x_0)^{m-1}$, and hence the polynomials $f_2, \dots, f_n$ are
also divisible by $(x - x_0)^{m-1}$. Having divided $f, f_1, \dots, f_n$ by $(x - x_0)^{m-1}$ we obtain
the Sturm sequence $\varphi, \varphi_1, \dots, \varphi_n$ for the polynomial

$$\varphi(x) = \frac{f(x)}{(x - x_0)^{m-1}}.$$

The root $x_0$ is a simple one for $\varphi$, and hence the passage through $x_0$ decreases
the number of sign changes in the sequence $\varphi, \varphi_1, \dots, \varphi_n$ by 1. But for a fixed $x \ne x_0$
the sequence $f, f_1, \dots, f_n$ is obtained from $\varphi, \varphi_1, \dots, \varphi_n$ by multiplication
by a nonzero constant, and therefore the numbers of sign changes in these sequences
coincide.


1.4.3 Sylvester’s theorem 


To compute the Sturm sequence is rather a laborious task. Sylvester suggested
the following more elegant method for computing the number of the real roots
of the polynomial. Let $f$ be a real polynomial of degree $n$ with simple roots
$\alpha_1, \dots, \alpha_n$. Set $s_k = \alpha_1^k + \dots + \alpha_n^k$. (Clearly, to calculate $s_k$, one does not have
to know the roots of the polynomial because $s_k$, being a symmetric function of the roots,
is expressed in terms of the coefficients of the polynomial.)


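Theorem 1.4.4 (Sylvester). a) The number of the real roots of $f$ is equal
to the signature of the quadratic form with the matrix

$$\begin{pmatrix}
s_0 & s_1 & \dots & s_{n-1}\\
s_1 & s_2 & \dots & s_n\\
\vdots & \vdots & & \vdots\\
s_{n-1} & s_n & \dots & s_{2n-2}
\end{pmatrix}.$$

b) All the roots of $f$ are positive if and only if the matrix

$$\begin{pmatrix}
s_1 & s_2 & \dots & s_n\\
s_2 & s_3 & \dots & s_{n+1}\\
\vdots & \vdots & & \vdots\\
s_n & s_{n+1} & \dots & s_{2n-1}
\end{pmatrix}$$

is positive definite.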
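Part a) can be checked on small examples. The sketch below is only an illustration under stated assumptions: the power sums are computed from the coefficients by Newton's identities (a standard route, not necessarily the one intended by the text), the signature is found by exact symmetric elimination over the rationals, and all function names are hypothetical.

```python
from fractions import Fraction

def power_sums(c, count):
    """s_0, ..., s_{count-1} for the monic polynomial x^n + c[0] x^(n-1) + ... + c[n-1]."""
    n = len(c)
    s = [Fraction(n)]
    for k in range(1, count):
        acc = sum(c[i - 1] * s[k - i] for i in range(1, min(k - 1, n) + 1))
        if k <= n:
            acc += k * c[k - 1]
        s.append(-Fraction(acc))
    return s

def signature(m):
    """(number of positive squares) - (number of negative squares) of a symmetric matrix."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sig = len(a), 0
    for k in range(n):
        if a[k][k] == 0:
            j = next((j for j in range(k + 1, n) if a[j][j] != 0), None)
            if j is not None:                      # swap variables k and j
                a[k], a[j] = a[j], a[k]
                for row in a:
                    row[k], row[j] = row[j], row[k]
            else:
                j = next((j for j in range(k + 1, n) if a[k][j] != 0), None)
                if j is None:
                    continue                       # variable k no longer occurs
                for t in range(n):                 # add row j to row k ...
                    a[k][t] += a[j][t]
                for t in range(n):                 # ... and column j to column k
                    a[t][k] += a[t][j]
        p = a[k][k]
        sig += 1 if p > 0 else -1
        for i in range(k + 1, n):                  # symmetric elimination with pivot p
            f = a[i][k] / p
            for t in range(n):
                a[i][t] -= f * a[k][t]
            for t in range(n):
                a[t][i] -= f * a[t][k]
    return sig

def real_root_count(c):
    """Signature of the matrix (s_{i+j}), 0 <= i, j <= n-1."""
    n = len(c)
    s = power_sums(c, 2 * n - 1)
    return signature([[s[i + j] for j in range(n)] for i in range(n)])
```

For example, `real_root_count([0, -3, 1])` treats $x^3 - 3x + 1$ and `real_root_count([0, 0, -1])` treats $x^3 - 1$.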


Proof. (Hermite) Let $\rho$ be a real parameter. Consider the quadratic form

$$F(x_1, \dots, x_n) = \frac{y_1^2}{\alpha_1 + \rho} + \dots + \frac{y_n^2}{\alpha_n + \rho}, \tag{1.1}$$

$$\text{where}\quad y_r = x_1 + \alpha_r x_2 + \dots + \alpha_r^{n-1} x_n. \tag{1.2}$$

The coefficients of the form $F$ are symmetric functions in the roots
of $f$, and hence they are real. In particular, this means that the form $F$ can
be represented as

$$h_1^2 + \dots + h_p^2 - h_{p+1}^2 - \dots - h_n^2,$$

where $h_1, \dots, h_n$ are linear forms in $x_1, \dots, x_n$ with real coefficients.
To the real root $\alpha_r$ there corresponds the summand

$$\frac{y_r^2}{\alpha_r + \rho} = \frac{(x_1 + \alpha_r x_2 + \dots + \alpha_r^{n-1} x_n)^2}{\alpha_r + \rho}.$$

This summand can be represented in the form $\pm h^2$, where the plus sign is
taken if $\alpha_r + \rho > 0$ and the minus sign otherwise.
The contribution of a pair of conjugate roots $\alpha_r$ and $\alpha_s$ is equal to

$$F_{r,s} = \frac{y_r^2}{\alpha_r + \rho} + \frac{y_s^2}{\alpha_s + \rho}.$$

Let $y_r = u + iv$ and $\dfrac{1}{\alpha_r + \rho} = \lambda + i\mu$, where $u$, $v$, $\lambda$, $\mu$ are real. Then
$y_s = u - iv$ and $\dfrac{1}{\alpha_s + \rho} = \lambda - i\mu$. Therefore

$$F_{r,s} = 2\lambda(u^2 - v^2) - 4\mu uv.$$

For $u = 0$ and for $v = 0$, the values of $F_{r,s}$ have opposite signs. Hence after a
change of variables we may assume that $F_{r,s} = u_1^2 - v_1^2$.
As a result we see that all the roots of $f$ are real and satisfy the inequality
$\alpha_r > -\rho$ if and only if the form (1.1) is positive definite. The matrix elements
of this form are

$$a_{ij} = \frac{\alpha_1^{i+j-2}}{\alpha_1 + \rho} + \dots + \frac{\alpha_n^{i+j-2}}{\alpha_n + \rho}.$$

Statements a) and b) are obtained by going to the limit as $\rho \to +\infty$ and
taking $\rho = 0$, respectively.

The quadratic form that appears in Sylvester’s theorem has quite an in- 
teresting interpretation. This interpretation will enable us to obtain another 
proof of Sylvester’s theorem; moreover, even for polynomials with multiple 
roots. 

Consider the linear space $V = \mathbb{R}[x]/(f)$ consisting of polynomials considered
modulo a polynomial $f \in \mathbb{R}[x]$. We assume that $f$ is monic and $\deg f = n$.
The polynomials $1, x, \dots, x^{n-1}$ form a basis of $V$. To every $a \in V$, we may
assign a linear map $V \to V$ given by the formula $v \mapsto av$ (since the elements
of $V$ are polynomials we can multiply them). Let $\operatorname{tr}(a)$ be the trace of this
map. Consider the symmetric bilinear form

$$\varphi(v, w) = \operatorname{tr}(vw).$$


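Theorem 1.4.5. a) Let $f(x) = (x - \alpha_1) \cdots (x - \alpha_n) \in \mathbb{R}[x]$ and $s_k = \alpha_1^k + \dots + \alpha_n^k$.
The matrix of $\varphi$ in the basis $1, x, \dots, x^{n-1}$ has the form

$$\begin{pmatrix}
s_0 & s_1 & \dots & s_{n-1}\\
s_1 & s_2 & \dots & s_n\\
\vdots & \vdots & & \vdots\\
s_{n-1} & s_n & \dots & s_{2n-2}
\end{pmatrix}.$$

b) The signature of the form $\varphi$ is equal to the number of distinct real roots
of $f$.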
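Part a) can be verified directly on examples by computing the trace of multiplication by $x^k$ in the basis $1, x, \dots, x^{n-1}$. The following is a small sketch under the assumption that $f$ is monic with integer coefficients; the helper names are illustrative only.

```python
def mul_by_x(v, c):
    """Multiply the residue v[0] + v[1] x + ... + v[n-1] x^(n-1) by x modulo the
    monic polynomial f = x^n + c[0] x^(n-1) + ... + c[n-1]."""
    n = len(v)
    top = v[-1]
    shifted = [0] + v[:-1]
    # x^n is replaced by -(c[0] x^(n-1) + ... + c[n-1])
    return [shifted[i] - top * c[n - 1 - i] for i in range(n)]

def trace_of_power(c, k):
    """Trace of the map v -> x^k v on the space of residues modulo f."""
    n = len(c)
    total = 0
    for j in range(n):
        v = [0] * n
        v[j] = 1
        for _ in range(k):
            v = mul_by_x(v, c)
        total += v[j]          # diagonal entry of the matrix of the map
    return total

def trace_form_matrix(c):
    """Matrix (tr(x^(i+j))) of the trace form in the basis 1, x, ..., x^(n-1)."""
    n = len(c)
    return [[trace_of_power(c, i + j) for j in range(n)] for i in range(n)]
```

For $f = (x-1)(x-2)(x+3) = x^3 - 7x + 6$, i.e. `c = [0, -7, 6]`, the value `trace_of_power(c, k)` can be compared with $1 + 2^k + (-3)^k$; for $f = (x-1)^2(x+2) = x^3 - 3x + 2$ the multiple root enters the power sums with its multiplicity.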


Proof. a) Over $\mathbb{C}$, the polynomial $f$ can be factorized into the product
of powers of pairwise relatively prime linear factors: $f = f_1^{m_1} \cdots f_r^{m_r}$. Thanks to the Chinese
remainder theorem (Lemma on p. 69) the map

$$h \pmod f \;\mapsto\; \bigl(h \pmod{f_1^{m_1}}, \dots, h \pmod{f_r^{m_r}}\bigr)$$

determines a canonical isomorphism

$$\mathbb{C}[x]/(f) \cong \mathbb{C}[x]/(f_1^{m_1}) \times \dots \times \mathbb{C}[x]/(f_r^{m_r}).$$

In this decomposition the factors are orthogonal with respect to $\varphi$. Indeed, let
polynomials $h_i$ and $h_j$ correspond to factors with distinct numbers $i$ and $j$,
i.e., $h_i \equiv 0 \pmod{f/f_i^{m_i}}$ and $h_j \equiv 0 \pmod{f/f_j^{m_j}}$. Then $h_i h_j \equiv 0 \pmod f$,
and therefore the map $v \mapsto h_i h_j v$ is the zero one. Hence its trace vanishes.
Therefore $\varphi = \varphi_1 + \dots + \varphi_r$, where $\varphi_i$ is the restriction of $\varphi$ onto the subspace
$\mathbb{C}[x]/(f_i^{m_i}) = \mathbb{C}[x]/(x - \alpha_i)^{m_i}$. It remains to verify that $\varphi_i(1, x^k) = m_i \alpha_i^k$.

It is easy to calculate the matrix of the form $\varphi_i$ in the basis

$$1,\ x - \alpha_i,\ \dots,\ (x - \alpha_i)^{m_i - 1}.$$

Indeed, in this basis the map $v \mapsto (x - \alpha_i)^k v$ has a triangular matrix, and the
trace of this matrix is equal to $m_i$ if $k = 0$ and to 0 if $k > 0$. Since

$$0 = \varphi_i(1, x - \alpha_i) = \varphi_i(1, x) - \alpha_i \varphi_i(1, 1) = \varphi_i(1, x) - m_i \alpha_i,$$

it follows that $\varphi_i(1, x) = m_i \alpha_i$. Next, with the help of the equality

$$\varphi_i\bigl(1, (x - \alpha_i)^k\bigr) = 0$$

and induction on $k$ we see that $\varphi_i(1, x^k) = m_i \alpha_i^k$.

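b) Computing the signature we must remain in $\mathbb{R}$, and therefore we decompose
$f$ over $\mathbb{R}$ into the product of powers of pairwise relatively prime linear or quadratic
factors: $f = f_1^{m_1} \cdots f_r^{m_r}$. Again consider the decomposition

$$\mathbb{R}[x]/(f) \cong \mathbb{R}[x]/(f_1^{m_1}) \times \dots \times \mathbb{R}[x]/(f_r^{m_r}).$$

It suffices to verify that the signature of the restriction of $\varphi$ onto $\mathbb{R}[x]/(f_i^{m_i})$
is equal to 1 if $\deg f_i = 1$ and to 0 if $f_i$ is a polynomial of degree 2 irreducible over $\mathbb{R}$.
As we have already established, in the basis $1, x - \alpha_i, \dots, (x - \alpha_i)^{m_i - 1}$,
the matrix of $\varphi_i$ is equal to

$$\begin{pmatrix}
m_i & 0 & \dots & 0\\
0 & 0 & \dots & 0\\
\vdots & \vdots & & \vdots\\
0 & 0 & \dots & 0
\end{pmatrix}.$$

Therefore if $\deg f_i = 1$ the signature of $\varphi_i$ is equal to 1.

If $f_i$ is a polynomial of degree 2 irreducible over $\mathbb{R}$, then $\mathbb{R}[x]/(f_i^{m_i}) \cong \mathbb{R}[x]/(x^2 + 1)^{m_i}$.
Here we mean an isomorphism over $\mathbb{R}$. Therefore it suffices
to calculate the signature of $\varphi$ on $\mathbb{R}[x]/(x^2 + 1)^m$. It is convenient to calculate
the matrix of $\varphi$ in the basis

$$1,\ x,\ x^2 + 1,\ x(x^2 + 1),\ (x^2 + 1)^2,\ \dots,\ (x^2 + 1)^{m-1},\ x(x^2 + 1)^{m-1}.$$

In this basis, the operators of multiplication by $x$ and $x^2$ have matrices

$$\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & \dots\\
-1 & 0 & 1 & 0 & 0 & \dots\\
0 & 0 & 0 & 1 & 0 & \dots\\
0 & 0 & -1 & 0 & 1 & \dots\\
\vdots & & & & & \ddots
\end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix}
-1 & 0 & 1 & 0 & 0 & \dots\\
0 & -1 & 0 & 1 & 0 & \dots\\
0 & 0 & -1 & 0 & 1 & \dots\\
0 & 0 & 0 & -1 & 0 & \dots\\
\vdots & & & & & \ddots
\end{pmatrix},$$

respectively (each row lists the coordinates of the image of a basis vector). Therefore the trace of the operator of multiplication by $x$ is equal
to 0 and the trace of the operator of multiplication by $x^2$ is equal to $-2m$.
The operators of multiplication by $x^a (x^2 + 1)^k$, where $a = 0, 1, 2$ and $k \ge 1$,
are represented by triangular matrices with zero main diagonals; their traces
vanish. As a result, we see that the matrix of the form $\varphi$ is equal to

$$\begin{pmatrix}
2m & 0 & 0 & \dots & 0\\
0 & -2m & 0 & \dots & 0\\
0 & 0 & 0 & \dots & 0\\
\vdots & \vdots & \vdots & & \vdots\\
0 & 0 & 0 & \dots & 0
\end{pmatrix}.$$

The signature of such a form is equal to zero.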


1.4.4 Separation of complex roots 


Sturm’s theorem enables one to indicate algorithmically a set of segments that 
contain all the real roots of a real polynomial and, moreover, each such seg- 
ment contains precisely one root. In a series of papers (1869-1878), Kronecker 
developed a theory with an algorithm to indicate a set of disks which contain 
all the complex roots of a complex polynomial so that each disk contains ex- 
actly one root. More exactly, Kronecker showed that the number of complex 
roots inside the given disk can be computed with the help of Sturm’s theorem. 

Let $z = x + iy$. Let us represent the polynomial $P(z)$ in the form $P(z) = \varphi(x, y) + i\psi(x, y)$.
We will assume that $P$ has no multiple roots, i.e., if $P(z) = 0$,
then $P'(z) \ne 0$.

To every root of $P$, there corresponds an intersection point of the curves
$\varphi = 0$ and $\psi = 0$. Therefore the number of roots of $P$ lying inside a closed
non-self-intersecting curve $\gamma$ is equal to the number of the intersection points
of the curves $\varphi = 0$ and $\psi = 0$ lying inside $\gamma$. This number can be calculated
as follows. Let us traverse the curve $\gamma$ in the positive direction, i.e.,
counterclockwise, and to each intersection point of the curves $\gamma$ and $\varphi = 0$ we
assign the number $\varepsilon_i = \pm 1$ according to the following rule: $\varepsilon_i = 1$ if we move
from the domain $\varphi\psi > 0$ to the domain $\varphi\psi < 0$, and $\varepsilon_i = -1$ if, the other way
round, we move from the domain $\varphi\psi < 0$ to the domain $\varphi\psi > 0$.

In the general position the number of intersection points of the curves $\gamma$
and $\varphi = 0$ is even (since at every intersection point the function $\varphi$ changes
its sign), and hence $\sum \varepsilon_i = 2k$, where $k$ is an integer.
Theorem 1.4.6 (Kronecker). a) The number $k$ is equal to the number of
intersection points of the curves $\varphi = 0$ and $\psi = 0$ lying inside the curve $\gamma$.

b) If $\gamma$ is a circle of given radius with given center, then for the given
polynomial $P$ the number $k$ can be algorithmically computed.


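Proof. a) Clearly, $dP(z) = (\varphi_x + i\psi_x)\,dx + (\psi_y - i\varphi_y)\,i\,dy$. Hence

$$\varphi_x + i\psi_x = P'(z) = \psi_y - i\varphi_y,$$

and therefore

$$\psi_y = \varphi_x \quad\text{and}\quad \psi_x = -\varphi_y$$

(the Cauchy-Riemann relations). Therefore

$$\begin{vmatrix} \varphi_x & \varphi_y\\ \psi_x & \psi_y \end{vmatrix}
= \begin{vmatrix} \varphi_x & \varphi_y\\ -\varphi_y & \varphi_x \end{vmatrix}
= \varphi_x^2 + \varphi_y^2 > 0.$$

This means that the rotation from the vector $\operatorname{grad}\varphi = (\varphi_x, \varphi_y)$ to the vector
$\operatorname{grad}\psi = (\psi_x, \psi_y)$ is a counterclockwise one. Geometrically this means that
the domains $\varphi\psi > 0$ and $\varphi\psi < 0$ are positioned as shown in Fig. 1.1.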


FIGURE 1.1 


Let us contract the curve $\gamma$ into a point. Under the passage through an
intersection point of the curves $\varphi = 0$ and $\psi = 0$ the number $k$ diminishes by
1 (Fig. 1.2) and under the reconstruction depicted in Fig. 1.3 the number $k$
does not vary. It is also clear that when the curve becomes sufficiently small
it does not intersect the curves $\varphi = 0$ and $\psi = 0$, and in this case $k = 0$.

FIGURE 1.2 FIGURE 1.3

b) The circle of radius $r$ and center $(a, b)$ can be parameterized with the
real parameter $t$ as follows:

$$x = a + r\,\frac{1 - t^2}{1 + t^2}, \qquad y = b + r\,\frac{2t}{1 + t^2}.$$

Having substituted these expressions into $\varphi(x, y)$ and cleared the denominators
(powers of $1 + t^2$) we obtain a polynomial $\Phi(t)$
with real coefficients. The real roots of this polynomial correspond to the
intersection points of the curves $\gamma$ and $\varphi = 0$. By Sturm's theorem, for every
root, we can find a segment that contains it. Having calculated the sign of the
function $\varphi\psi$ at the endpoints of this segment one can find the corresponding
numbers $\varepsilon_i$.
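The counting rule for $\varepsilon_i$ can be illustrated numerically. The sketch below is not the exact Sturm-based procedure of the proof: as a stated simplification it samples the circle at many points and assumes that the sampling is fine enough for $\varphi$ and $\psi$ never to change sign inside the same step. The function name and the `samples` parameter are hypothetical.

```python
import cmath
import math

def kronecker_count(coeffs, center, radius, samples=20000):
    """Estimate k = (sum of eps_i) / 2 for P on the circle |z - center| = radius.

    coeffs lists the coefficients of P from the highest degree down.
    """
    def P(z):
        acc = 0j
        for c in coeffs:
            acc = acc * z + c
        return acc

    total = 0
    prev = None
    for j in range(samples + 1):            # the last point closes the loop
        t = 2 * math.pi * (j + 0.5) / samples
        w = P(center + radius * cmath.exp(1j * t))
        if prev is not None and prev.real * w.real < 0:
            # phi = Re P changes sign here; compare the sign of phi * psi
            before = prev.real * prev.imag
            after = w.real * w.imag
            if before > 0 > after:
                total += 1                  # from phi*psi > 0 to phi*psi < 0
            elif before < 0 < after:
                total -= 1                  # from phi*psi < 0 to phi*psi > 0
        prev = w
    return total // 2
```

For $P(z) = z^3 - 1$, the call `kronecker_count([1, 0, 0, -1], 1, 0.5)` examines the disk of radius 0.5 around the root $z = 1$, and `kronecker_count([1, 0, 0, -1], 0, 2)` examines a disk containing all three roots.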


1.5 Lagrange’s series and estimates of the roots of a 
given polynomial 


1.5.1 The Lagrange-Bürmann series


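Recall that if $f(z) = \sum\limits_{n=-\infty}^{\infty} c_n (z - a)^n$, then

$$\frac{1}{2\pi i} \int_\gamma f(z)\,dz = c_{-1},$$

where $\gamma$ is any curve circumscribing the point $a$. We will use this fact to obtain
the expansion of the function $f(z)$ into a series in powers of $\varphi(z) - b$, where
$b = \varphi(a)$. To be able to do so, the function $\varphi(z)$ should be invertible in a
neighborhood of $a$, i.e., $\varphi'(a) \ne 0$. If $\varphi(z)$ is invertible, then

$$\frac{f'(z)\varphi'(a)}{\varphi(z) - \varphi(a)} = \frac{f'(z)\varphi'(a)}{\varphi'(a)(z - a) + \cdots} = \frac{f'(a)}{z - a} + \cdots,$$

and hence

$$f'(a) = \frac{1}{2\pi i} \int_\gamma \frac{f'(z)\varphi'(a)}{\varphi(z) - \varphi(a)}\,dz.$$

Having integrated this identity (written with $\zeta$ in place of $a$ and $w$ in place of $z$) over $\zeta$ from $a$ to $z$, we obtain

$$f(z) - f(a) = \frac{1}{2\pi i} \int_a^z \int_\gamma \frac{f'(w)\varphi'(\zeta)}{\varphi(w) - \varphi(\zeta)}\,dw\,d\zeta.$$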


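Let us transform the expression obtained having separated the terms $\varphi(\zeta) - b$,
where $b = \varphi(a)$:

$$\frac{f'(w)}{\varphi(w) - \varphi(\zeta)} = \frac{f'(w)}{\varphi(w) - b} \cdot \frac{\varphi(w) - b}{\varphi(w) - \varphi(\zeta)},$$

$$\frac{\varphi(w) - b}{\varphi(w) - \varphi(\zeta)} = \left(1 - \frac{\varphi(\zeta) - b}{\varphi(w) - b}\right)^{-1} = \sum_{m=0}^{\infty} \left(\frac{\varphi(\zeta) - b}{\varphi(w) - b}\right)^{m}.$$

By changing the order of integration we obtain

$$f(z) - f(a) = \frac{1}{2\pi i} \int_\gamma \left( \frac{f'(w)}{\varphi(w) - b} \sum_{m=0}^{\infty} \int_a^z \left(\frac{\varphi(\zeta) - b}{\varphi(w) - b}\right)^{m} \varphi'(\zeta)\,d\zeta \right) dw.$$

When we calculate the integral over $\zeta$ we only need the factors depending on $\zeta$:

$$\int_a^z (\varphi(\zeta) - b)^m \varphi'(\zeta)\,d\zeta = \int_{\varphi(a)}^{\varphi(z)} (\varphi(\zeta) - b)^m\,d\varphi(\zeta) = \frac{(\varphi(z) - b)^{m+1}}{m + 1}$$

(we have taken into account that $\varphi(a) - b = 0$).
Thus,

$$f(z) - f(a) = \sum_{m=0}^{\infty} \frac{(\varphi(z) - b)^{m+1}}{m + 1} \cdot \frac{1}{2\pi i} \int_\gamma \frac{f'(w)\,dw}{(\varphi(w) - b)^{m+1}}.$$

Consider a function $\psi(w)$ such that

$$\psi(w) = \frac{w - a}{\varphi(w) - b}. \tag{1}$$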

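For this function $\psi(w)$, we have

$$\frac{1}{2\pi i} \int_\gamma \frac{f'(w)\,dw}{(\varphi(w) - b)^{m+1}} = \frac{1}{2\pi i} \int_\gamma \frac{f'(w) (\psi(w))^{m+1}\,dw}{(w - a)^{m+1}} = \frac{1}{m!} \frac{d^m}{dw^m} \Bigl( f'(w) (\psi(w))^{m+1} \Bigr)\Big|_{w=a}.$$

Indeed,

$$f'(w) (\psi(w))^{m+1} = \sum_{k=0}^{\infty} c_k (w - a)^k, \quad\text{where}\quad c_k = \frac{1}{k!} \frac{d^k}{dw^k} \Bigl( f'(w) (\psi(w))^{m+1} \Bigr)\Big|_{w=a}.$$

The integral we are interested in is equal to $c_m$, the coefficient of $(w - a)^{-1}$
in the series $\sum\limits_{k=0}^{\infty} c_k (w - a)^{k-m-1}$.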
k=0 
As a result, we obtain the following expansion of $f(z)$ into powers of $\varphi(z) - b$:

$$f(z) = f(a) + \sum_{n=1}^{\infty} \frac{(\varphi(z) - b)^n}{n!} \cdot \frac{d^{n-1}}{dw^{n-1}} \Bigl( f'(w) (\psi(w))^n \Bigr)\Big|_{w=a}, \tag{2}$$

where $\psi(w)$ is given by formula (1). The series (2) is called Bürmann's series.
Bürmann obtained it in 1799 while generalizing a series Lagrange obtained
in 1770. The Lagrange series can be obtained from Bürmann's series for
$\varphi(z) = \dfrac{z - a}{h(z)}$, where $h(z)$ is a function. In this case $b = \varphi(a) = 0$ and

$$\psi(w) = \frac{w - a}{\varphi(w)} = h(w).$$

Therefore

$$f(z) = f(a) + \sum_{n=1}^{\infty} \frac{s^n}{n!} \cdot \frac{d^{n-1}}{da^{n-1}} \Bigl( f'(a) (h(a))^n \Bigr),$$

where $s = \varphi(z)$. In particular,

$$z = a + \sum_{n=1}^{\infty} \frac{s^n}{n!} \cdot \frac{d^{n-1}}{da^{n-1}} (h(a))^n. \tag{3}$$

Thus, if the series (3) converges, it enables one to calculate the roots of the
equation

$$z = a + s\,h(z).$$


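Example. Let $h(z) = \dfrac{1}{z}$. In this case the series (3) has the form

$$z = a + \sum_{n=1}^{\infty} (-1)^{n-1} \frac{(2n - 2)!}{n!\,(n - 1)!} \cdot \frac{s^n}{a^{2n-1}}. \tag{4}$$

Series (4) converges for $|s| < \dfrac{|a|^2}{4}$. The equation under consideration,

$$z = a + \frac{s}{z},$$

has two roots

$$z_1 = \frac{a}{2}\left(1 + \sqrt{1 + \frac{4s}{a^2}}\right), \qquad z_2 = \frac{a}{2}\left(1 - \sqrt{1 + \frac{4s}{a^2}}\right).$$

The series (4) represents only the first of these roots.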
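The example can be checked numerically by comparing a partial sum of the series for $z = a + s/z$ with the closed-form root. This is a minimal sketch for real $a > 0$ and small real $s$; the function name and the number of terms are arbitrary choices made for illustration.

```python
from math import comb, sqrt

def lagrange_root(a, s, terms=60):
    """Partial sum of the Lagrange series for the equation z = a + s / z."""
    total = a
    for n in range(1, terms + 1):
        # (2n-2)! / (n! (n-1)!) equals C(2n-2, n-1) / n
        coeff = comb(2 * n - 2, n - 1) / n
        total += (-1) ** (n - 1) * coeff * s ** n / a ** (2 * n - 1)
    return total

a, s = 2.0, 0.5                               # |s| < a^2 / 4, so the series converges
z_series = lagrange_root(a, s)
z_closed = a / 2 * (1 + sqrt(1 + 4 * s / a ** 2))
```

Here `z_series` and `z_closed` agree to within rounding error, and both satisfy $z = a + s/z$; the second root $\frac{a}{2}\bigl(1 - \sqrt{1 + 4s/a^2}\bigr)$ is not produced by the series.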


1.5.2 Lagrange’s series and estimation of roots 


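Lagrange's series enables one in certain cases to estimate the roots of polynomials.
Consider, for example, the polynomial

$$f(z) = a_0 + a_1 (z - c) + a_2 (z - c)^2 + \dots + a_k (z - c)^k.$$

The equation $f(z) = 0$ can be expressed in the form

$$z = c + s\,h(z),$$

where $s = -\dfrac{1}{a_1}$ and $h(z) = a_0 + a_2 (z - c)^2 + a_3 (z - c)^3 + \dots + a_k (z - c)^k$.
Lagrange's series for this equation is of the form

$$z = c + \sum_{n=1}^{\infty} \frac{s^n}{n!} \cdot \frac{d^{n-1}}{dz^{n-1}} \bigl(h^n(z)\bigr)\Big|_{z=c}. \tag{1}$$

In our case

$$h^n(z) = \sum_{\nu_0 + \nu_2 + \dots + \nu_k = n} \frac{n!}{\nu_0!\,\nu_2! \cdots \nu_k!}\, a_0^{\nu_0} a_2^{\nu_2} \cdots a_k^{\nu_k} (z - c)^{2\nu_2 + \dots + k\nu_k},$$

and hence

$$\frac{1}{n!} \cdot \frac{d^{n-1}}{dz^{n-1}} \bigl(h^n(z)\bigr)\Big|_{z=c} = \sum \frac{(n - 1)!}{\nu_0!\,\nu_2! \cdots \nu_k!}\, a_0^{\nu_0} a_2^{\nu_2} \cdots a_k^{\nu_k},$$

where the sum runs over the collections $\{\nu_0, \nu_2, \dots, \nu_k\}$ such that

$$\nu_0 + \nu_2 + \dots + \nu_k = n, \qquad 2\nu_2 + \dots + k\nu_k = n - 1.$$

These relations are equivalent to the relations

$$n - 1 = 2\nu_2 + \dots + k\nu_k, \qquad \nu_0 = \nu_2 + 2\nu_3 + \dots + (k - 1)\nu_k + 1.$$

Since $s = -\dfrac{1}{a_1}$, we obtain

$$z = c - \frac{a_0}{a_1} \sum \frac{(2\nu_2 + \dots + k\nu_k)!}{\nu_0!\,\nu_2! \cdots \nu_k!} \left(\frac{a_0 a_2}{(-a_1)^2}\right)^{\nu_2} \cdots \left(\frac{a_0^{k-1} a_k}{(-a_1)^k}\right)^{\nu_k}, \tag{2}$$

where $\nu_0 = \nu_2 + 2\nu_3 + \dots + (k - 1)\nu_k + 1$ and the sum runs over all nonnegative integers $\nu_2, \dots, \nu_k$.

If the series (2) converges, the number $z$ so determined is one of the roots
of the equation $f(z) = 0$.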
Theorem 1.5.1 ([Be3]). Let $|a_0| + |a_2| + \dots + |a_k| < |a_1|$. Then the series
(2) converges and the root $z$ determined by the series satisfies

$$|z - c| \le -\ln\left(1 - \frac{1}{|a_1|} \bigl(|a_0| + |a_2| + \dots + |a_k|\bigr)\right).$$

Proof. Formula (1) implies that

$$\left| \frac{1}{n!} \cdot \frac{d^{n-1}}{dz^{n-1}} \bigl(h^n(z)\bigr)\Big|_{z=c} \right| \le \frac{1}{n} \bigl(|a_0| + |a_2| + \dots + |a_k|\bigr)^n.$$

Hence

$$|z - c| \le \sum_{n=1}^{\infty} \frac{1}{n\,|a_1|^n} \bigl(|a_0| + |a_2| + \dots + |a_k|\bigr)^n = -\ln\left(1 - \frac{1}{|a_1|} \bigl(|a_0| + |a_2| + \dots + |a_k|\bigr)\right).$$
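The estimate can be tried on a concrete polynomial. In the sketch below the root near $c$ is found not by summing the series but by the fixed-point iteration $z \mapsto c + s\,h(z)$, which is a substituted technique assumed here to converge to the same root when $|a_0| + |a_2| + \dots + |a_k|$ is small compared with $|a_1|$; the function names are illustrative.

```python
from math import log

def root_near_c(a, c=0.0, iterations=200):
    """Iterate z -> c + s h(z) for f(z) = sum of a[j] (z - c)^j, where s = -1 / a[1]."""
    z = c
    for _ in range(iterations):
        h = a[0] + sum(a[j] * (z - c) ** j for j in range(2, len(a)))
        z = c - h / a[1]
    return z

def log_bound(a):
    """Right-hand side -ln(1 - (|a_0| + |a_2| + ... + |a_k|) / |a_1|) of the estimate."""
    q = (abs(a[0]) + sum(abs(x) for x in a[2:])) / abs(a[1])
    return -log(1 - q)

coeffs = [0.1, 1.0, 0.2]          # f(z) = 0.1 + z + 0.2 z^2, with c = 0
z = root_near_c(coeffs)
bound = log_bound(coeffs)
```

For this polynomial the root found lies close to $-0.102$, well inside the disk $|z| \le -\ln 0.7$ given by the theorem.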


1.6 Problems to Chapter 1 


1.1 Prove that a polynomial $f(x)$ is divisible by $f'(x)$ if and only if $f(x) = a_0 (x - x_0)^n$.


1.2 Prove that the polynomial

$$a_0 + a_1 x^{m_1} + a_2 x^{m_2} + \dots + a_n x^{m_n}$$

has at most $n$ positive roots.

1.3 [Newton] Prove that if all the roots of the polynomial

$$P(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_n$$

with real coefficients are real and distinct, then

$$a_i^2 > \frac{n - i + 1}{n - i} \cdot \frac{i + 1}{i}\, a_{i-1} a_{i+1} \quad\text{for } i = 1, 2, \dots, n - 1.$$

1.4 Prove that the polynomial

$$a_1 x^{m_1} + a_2 x^{m_2} + \dots + a_n x^{m_n}$$

has no nonzero roots of multiplicity greater than $n - 1$.


1.5 Find the number of real roots of the following polynomials: 
1.6 Let $0 = m_0 < m_1 < \dots < m_n$ and $m_i \equiv i \pmod 2$. Prove that the
polynomial

$$a_0 + a_1 x^{m_1} + a_2 x^{m_2} + \dots + a_n x^{m_n}$$

has at most $n$ real roots.


1.7 Let $x_0$ be a root of the polynomial $x^n + a_1 x^{n-1} + \dots + a_n$. Prove that for
any $\varepsilon > 0$ there exists a $\delta > 0$ such that if $|a_i - a_i'| < \delta$ for $i = 1, \dots, n$, then
the polynomial $x^n + a_1' x^{n-1} + \dots + a_n'$ has a root $x_0'$ such that $|x_0 - x_0'| < \varepsilon$.


1.8 Let the numbers $a_1, \dots, a_n$ be distinct and let the numbers $b_1, \dots, b_n$ be
positive. Prove that all the roots of the equation

$$\sum_{k=1}^{n} \frac{b_k}{x - a_k} = x - c, \quad\text{where } c \in \mathbb{R},$$

are real.

1.9 Find all the roots of the equation

$$\frac{(x^2 - x + 1)^3}{x^2 (x - 1)^2} = \frac{(a^2 - a + 1)^3}{a^2 (a - 1)^2}.$$

1.10 Find the number of roots of the polynomial «” + 2” — 1 whose absolute 
values are less than 1. 


1.11 Let $f(z) = z^n + a_1 z^{n-1} + \dots + a_n$, where $a_1, \dots, a_n \in \mathbb{C}$. Prove that
any root $z$ of $f$ satisfies $-\beta \le \operatorname{Re} z \le \alpha$, where $\alpha$ is the only positive root of
the polynomial

$$x^n + (\operatorname{Re} a_1) x^{n-1} - |a_2| x^{n-2} - \dots - |a_n|$$

and $\beta$ is the only positive root of the polynomial

$$x^n - (\operatorname{Re} a_1) x^{n-1} - |a_2| x^{n-2} - \dots - |a_n|.$$

1.12 [Su1] Let $f(z)$ be a polynomial of degree $n$ with complex coefficients.
Prove that the polynomial $F = f \cdot f' \cdot f'' \cdots f^{(n-1)}$ has at least $n + 1$ distinct
roots.


1.7 Solutions of selected problems 


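1.3. Set $Q(y) = y^n P(y^{-1})$. The roots of $Q(y)$ are also real and distinct. Hence
the roots of the quadratic polynomial

$$Q^{(n-2)}(y) = (n - 2)(n - 3) \cdots 3 \cdot \bigl( n(n - 1) a_n y^2 + 2(n - 1) a_{n-1} y + 2 a_{n-2} \bigr)$$

are real and distinct. Therefore

$$(n - 1)^2 a_{n-1}^2 > 2n(n - 1) a_n a_{n-2}.$$

For $i = n - 1$ the desired inequality is proved.
Now consider the polynomial

$$P^{(n-i-1)}(x) = b_0 x^{i+1} + b_1 x^i + \dots + b_{i-1} x^2 + b_i x + b_{i+1}.$$

Applying to it the inequality already proved we obtain

$$b_i^2 > \frac{2(i + 1)}{i}\, b_{i-1} b_{i+1}.$$

Since

$$\begin{aligned}
b_{i-1} &= (n - i + 1) \cdots 4 \cdot 3\, a_{i-1},\\
b_i &= (n - i) \cdots 3 \cdot 2\, a_i,\\
b_{i+1} &= (n - i - 1) \cdots 2 \cdot 1\, a_{i+1},
\end{aligned}$$

it follows that

$$\bigl( (n - i) \cdots 3 \cdot 2\, a_i \bigr)^2 > \frac{2(i + 1)}{i}\, \bigl( (n - i + 1) \cdots 4 \cdot 3 \bigr) \bigl( (n - i - 1) \cdots 2 \cdot 1 \bigr)\, a_{i-1} a_{i+1}.$$

After simplification we obtain the desired inequality.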
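1.11. As $x$ grows from 0 to $+\infty$, the function $x + \operatorname{Re} a_1$ monotonically
increases, whereas the function

$$\frac{|a_2|}{x} + \frac{|a_3|}{x^2} + \dots + \frac{|a_n|}{x^{n-1}}$$

monotonically decreases. Therefore each of the polynomials considered has
only one positive root.
Let $f(z) = 0$ and $\operatorname{Re} z > \alpha$. Then

$$\alpha + \operatorname{Re} a_1 < \operatorname{Re}(z + a_1) \le |z + a_1| = \left| \frac{a_2}{z} + \frac{a_3}{z^2} + \dots + \frac{a_n}{z^{n-1}} \right| \le \frac{|a_2|}{\alpha} + \frac{|a_3|}{\alpha^2} + \dots + \frac{|a_n|}{\alpha^{n-1}}$$

(the last inequality follows since $|z| \ge \operatorname{Re} z > \alpha$). On the other hand, by the
hypothesis

$$\alpha + \operatorname{Re} a_1 = \frac{|a_2|}{\alpha} + \dots + \frac{|a_n|}{\alpha^{n-1}},$$

a contradiction.
The estimate of $\operatorname{Re} z$ from below is obtained as the estimate from above
of the real part of a root of the polynomial $(-1)^n f(-z)$.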
1.12. Let $z_1, \dots, z_m$ be the distinct roots of $F$, and let $\mu_j(r)$ be the multiplicity
of $z_j$ as a root of $f^{(r)}$, where $r = 0, 1, \dots, n - 1$. Consider the symmetric
functions

$$s_k(r) = \sum_{j=1}^{m} \mu_j(r) z_j^k, \tag{1}$$

i.e., $s_k(r)$ is the sum of the $k$th powers of the roots of $f^{(r)}$. The elementary
symmetric functions in the roots of $f^{(r)}$ will be denoted by $\sigma_k(r)$ (for $k > n - r$
we set $\sigma_k(r) = 0$).

It is easy to verify that, if $f(z) = \sum\limits_{k=0}^{n} (-1)^k a_k z^{n-k}$, then

$$f^{(r)}(z) = \sum_{k=0}^{n-r} (-1)^k a_k \frac{(n - k)!}{(n - k - r)!}\, z^{n-k-r}.$$

Hence

$$\sigma_k(r) = \frac{a_k}{a_0\, n(n - 1) \cdots (n - k + 1)}\, (n - r)(n - r - 1) \cdots (n - k - r + 1).$$

Therefore $\sigma_k(r)$ is a polynomial of degree at most $k$ in $r$ and $\sigma_k(n) = 0$ for $k \ge 1$.
On p. 79, for $k \ge 1$, the identity

$$s_k = \begin{vmatrix}
\sigma_1 & 1 & 0 & \dots & 0\\
2\sigma_2 & \sigma_1 & 1 & \dots & 0\\
\vdots & \vdots & \vdots & & \vdots\\
k\sigma_k & \sigma_{k-1} & \sigma_{k-2} & \dots & \sigma_1
\end{vmatrix}$$

is proved. This identity implies, in particular, that $s_k(r)$, where $k > 0$, can
be represented as a linear combination of expressions $\sigma_{k_1}(r) \cdots \sigma_{k_p}(r)$, where
$k_1 + \dots + k_p = k$, and the coefficients of this linear combination do not depend
on $r$. Therefore, if $k \ge 1$, then $s_k(r)$ is a polynomial in $r$ of degree not greater
than $k$. It is also clear that $s_0(r) = \sum\limits_{j=1}^{m} \mu_j(r) = n - r$ and $s_k(n) = 0$ for all
$k \ge 0$.

Consider the relation (1) for $k = 0, 1, \dots, m - 1$ as a system of linear equations
for unknowns $\mu_j(r)$, where $j = 1, \dots, m$. By the hypothesis, the numbers
$z_1, \dots, z_m$ are distinct, and therefore the determinant of the system considered
does not vanish (this determinant is a Vandermonde determinant, see
[Pr1]). Having solved this system of linear equations via Cramer's rule
we obtain a representation of $\mu_j(r)$ in the form of a linear combination of the
$s_k(r)$, where $k = 0, \dots, m - 1$, with coefficients independent of $r$. Hence $\mu_j(r)$
is a polynomial in $r$ of degree $d_j \le m - 1$. Since $s_k(n) = 0$ for all $k$, we have
$\mu_j(n) = 0$.


Let the number of distinct roots of $F$ be strictly less than $n + 1$, i.e.,
$m < n + 1$. Then $d_j \le m - 1 < n$, i.e., $\mu_j(r)$ is a polynomial in $r$ of degree
$\le n - 1$. In this case

$$\deg \Delta^1 \mu_j(r) = \deg\bigl(\mu_j(r + 1) - \mu_j(r)\bigr) \le n - 2,$$
$$\deg \Delta^2 \mu_j(r) = \deg\bigl(\Delta^1 \mu_j(r + 1) - \Delta^1 \mu_j(r)\bigr) \le n - 3, \ \dots,$$

$\Delta^{n-1} \mu_j(r)$ is a constant, and $\Delta^n \mu_j(r)$ is identically zero. In particular,

$$\Delta^n \mu_1(0) = \sum_{r=0}^{n} (-1)^{n-r} \binom{n}{r} \mu_1(r) = 0.$$

To arrive at a contradiction, it suffices to show that $\Delta^n \mu_1(0) \ne 0$.

Consider the convex hull of the roots of $f$. By the Gauss-Lucas theorem
(Theorem 1.2.1 on p. 13), this convex hull coincides with the convex hull of
the points $z_1, \dots, z_m$. We may assume that $z_1$ is a vertex of the convex hull of
the roots of $f$. Then $z_1$ lies outside the convex hull of the points $z_2, \dots, z_m$.
Let $\mu = \mu_1(0)$ be the multiplicity of $z_1$ as a root of $f$. Then for $0 \le r \le \mu - 1$
the number $z_1$ is a root of multiplicity $\mu - r$ of $f^{(r)}$ and $f^{(\mu)}(z_1) \ne 0$. The
convex hull of the roots of $f^{(\mu)}$ does not contain $z_1$, and hence $f^{(r)}(z_1) \ne 0$
for $r \ge \mu$. Therefore

$$\mu_1(r) = \begin{cases} \mu - r & \text{for } 0 \le r \le \mu - 1;\\ 0 & \text{for } r \ge \mu. \end{cases}$$

It is also clear that $\mu \le n - 1$, since $f$ has at least one root distinct from $z_1$.
Hence

$$\Delta^2 \mu_1(r) = \begin{cases} 0 & \text{for } 0 \le r \le n - 1,\ r \ne \mu - 1;\\ 1 & \text{for } r = \mu - 1. \end{cases}$$

Therefore, for $n > 2$, we obtain

$$\Delta^n \mu_1(0) = \Delta^{n-2} (\Delta^2 \mu_1)(0) = \sum_{r=0}^{n-2} (-1)^{n-2-r} \binom{n - 2}{r} \Delta^2 \mu_1(r) = (-1)^{n-\mu-1} \binom{n - 2}{\mu - 1},$$

and, for $n = 2$, we obtain $\mu = 1$ and $\Delta^2 \mu_1(0) = 1$. In both cases $\Delta^n \mu_1(0) \ne 0$,
as was required.


2 


Irreducible Polynomials 


2.1 Main properties of irreducible polynomials 


2.1.1 Factorization of polynomials into irreducible factors 


Let f and g be polynomials in one variable with coefficients from a field k. We 
say that f is divisible by g if f = gh, where h is a polynomial (with coefficients 
in k). 

The polynomial d is called a common divisor of f and g if both f and 
g are divisible by d. The common divisor d of f and g is called the greatest 
common divisor if it is divisible by any common divisor of f and g. Clearly, 
the greatest common divisor is defined uniquely up to multiplication by a 
nonzero element of k. 

One can find the greatest common divisor $d = (f, g)$ of $f$ and $g$ with the
help of the following Euclid's algorithm. Suppose, for the sake of definiteness,
that $\deg f \ge \deg g$. Let $r_1$ be the remainder after division of $f$ by $g$, let
$r_2$ be the remainder after division of $g$ by $r_1$, and generally let $r_{k+1}$ be the
remainder after division of $r_{k-1}$ by $r_k$. Since the degrees of the polynomials
$r_i$ strictly decrease, it follows that, for some $n$, we have $r_{n+1} = 0$, i.e., $r_{n-1}$ is
divisible by $r_n$. We see that both $f$ and $g$ are also divisible by $r_n$ because $r_n$
divides all the polynomials $r_{n-1}, r_{n-2}, \dots$. Moreover, if $f$ and $g$ are divisible
by a polynomial $h$, then $r_n$ is also divisible by $h$ since $h$ divides $r_1, r_2, \dots$.
Therefore $r_n = (f, g)$.
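Euclid's algorithm as described above can be sketched over the field of rational numbers as follows. The representation (coefficient lists from the highest degree down, with nonzero leading coefficients) and the function names are assumptions made for this illustration; the result is normalized to be monic, which fixes the nonzero constant factor.

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a divided by b; coefficients are Fractions, highest degree first."""
    a = a[:]
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)                      # the leading coefficient is now zero
    while a and a[0] == 0:            # strip remaining leading zeros
        a.pop(0)
    return a

def poly_gcd(f, g):
    """Monic greatest common divisor of two nonzero polynomials over Q."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while g:                          # r_{k+1} = remainder of r_{k-1} divided by r_k
        f, g = g, poly_rem(f, g)
    return [c / f[0] for c in f]
```

For example, `poly_gcd([1, 1, -2], [1, -4, 3])` computes the greatest common divisor of $(x - 1)(x + 2)$ and $(x - 1)(x - 3)$.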

Euclid’s algorithm directly implies important corollaries which we formu- 
late as a separate theorem. 


Theorem 2.1.1. a) If d is the greatest common divisor of polynomials f and 
g, then there exist polynomials a and b such that d= af + bg. 

b) Let $f$ and $g$ be polynomials over the field $k \subset K$. If $f$ and $g$ have a
nontrivial common divisor over $K$, they have a nontrivial common divisor
over $k$ also.


V.V. Prasolov, Polynomials, Algorithms and Computation in Mathematics 11, 47 
DOI 10.1007/978-3-642-03980-5_2, © Springer-Verlag Berlin Heidelberg 2010 




A polynomial f with coefficients from a ring k is called reducible over k 
if f = gh, where g and h are polynomials of positive degree with coefficients 
from k. Otherwise f is called irreducible over k. 

Let $f = f_1 \cdots f_s$ be a factorization of a polynomial $f$ over a field $k$ into
factors $f_1, \dots, f_s$ which are polynomials over $k$. From the factorization into
the product of factors with arbitrary coefficients we can pass to factorization
into monic polynomials. Indeed, if $f_i(x) = a_i x^{n_i} + \cdots$ is a polynomial over the
field $k$, then $g_i = \dfrac{f_i}{a_i}$ is a monic polynomial over $k$. Hence, we can replace
the factorization $f = f_1 \cdots f_s$ by the factorization $f = a g_1 \cdots g_s$, where
$a = a_1 \cdots a_s$. We will not distinguish two factorizations of such a form that
differ only by the order of factors.


Theorem 2.1.2. Let $k$ be a field. Then the polynomial $f \in k[x]$ can be factorized
into irreducible factors and this factorization is unique.


Proof. The existence of the factorization is easy to prove by induction on
$n = \deg f$. First of all, observe that, for irreducible $f$, the desired factorization
consists of $f$ itself.

For $n = 1$, the polynomial $f$ is irreducible. Let the factorization exist for
any polynomial of degree $< n$ and let $\deg f = n$. We may assume that $f$ is
reducible, i.e., $f = gh$, where $\deg g < n$ and $\deg h < n$. But the factorizations
for $g$ and $h$ exist by the induction hypothesis.

Let us prove now the uniqueness of the factorization. Let $a g_1 \cdots g_s = b h_1 \cdots h_t$,
where $a, b \in k$ and $g_1, \dots, g_s, h_1, \dots, h_t$ are irreducible monic polynomials
over $k$. Clearly, in this case $a = b$. The polynomial $g_1 \cdots g_s$ is divisible
by the irreducible polynomial $h_1$. This means that one of the polynomials
$g_1, \dots, g_s$ is divisible by $h_1$. To see this, it suffices to prove the following auxiliary
statement.


Lemma. If the polynomial $qr$ is divisible by an irreducible polynomial $p$,
then either $q$ or $r$ is divisible by $p$.


Proof. Suppose that $q$ is not divisible by $p$. Then $(p, q) = 1$, i.e., there
exist polynomials $a$ and $b$ such that $ap + bq = 1$. Having multiplied both sides
of this identity by $r$ we get $apr + bqr = r$. But $pr$ and $qr$ are divisible by $p$,
and so $r$ is also divisible by $p$.


For definiteness, let $g_1$ be divisible by $h_1$. Taking into account that $g_1$ and
$h_1$ are monic irreducible polynomials, we deduce that $g_1 = h_1$. Let us simplify
the equality $g_1 \cdots g_s = h_1 \cdots h_t$ having divided by $g_1 = h_1$. After several
such operations we deduce that $s = t$ and $g_1 = h_{i_1}, \dots, g_s = h_{i_s}$, where
$\{i_1, \dots, i_s\}$ is a permutation of the set $\{1, \dots, s\}$.


Irreducibility of polynomials over the ring of integers $\mathbb{Z}$ is defined exactly
as for polynomials over fields, i.e., $f \in \mathbb{Z}[x]$ is irreducible over $\mathbb{Z}$ if it cannot
be represented as a product of polynomials of positive degree with integer
coefficients. But when the coefficients of the polynomial belong to a ring one
cannot always divide the coefficients by the highest coefficient; one can only
divide the coefficients by the greatest common divisor of all the coefficients.
This complication leads to the following definition. Let $f(x) = \sum a_i x^i$, where
$a_i \in \mathbb{Z}$. The greatest common divisor of the coefficients $a_0, \dots, a_n$ is called
the content of $f$ and denoted by $\operatorname{cont}(f)$. Clearly, $f(x) = \operatorname{cont}(f) g(x)$, where
$g$ is a polynomial over $\mathbb{Z}$ with content 1.


Lemma (Gauss). $\operatorname{cont}(fg) = \operatorname{cont}(f) \operatorname{cont}(g)$.


Proof. It suffices to consider the case where $\operatorname{cont}(f) = \operatorname{cont}(g) = 1$. Indeed,
the coefficients of the polynomials $f$ and $g$ can be divided by $\operatorname{cont}(f)$
and $\operatorname{cont}(g)$ respectively.

Let $f(x) = \sum a_i x^i$, $g(x) = \sum b_i x^i$, $fg(x) = \sum c_i x^i$. Suppose that
$\operatorname{cont}(fg) = d > 1$ and $p$ is a prime divisor of $d$. Then all the coefficients
of $fg$ are divisible by $p$ whereas $f$ and $g$ have coefficients not divisible by $p$.
Let $a_r$ be the first coefficient of $f$ not divisible by $p$, and $b_s$ the first coefficient
of $g$ not divisible by $p$. Then

$$c_{r+s} = a_r b_s + a_{r+1} b_{s-1} + a_{r+2} b_{s-2} + \dots + a_{r-1} b_{s+1} + a_{r-2} b_{s+2} + \dots \equiv a_r b_s \not\equiv 0 \pmod p,$$

since

$$b_{s-1} \equiv b_{s-2} \equiv \dots \equiv b_0 \equiv 0 \pmod p,$$

$$a_{r-1} \equiv a_{r-2} \equiv \dots \equiv a_0 \equiv 0 \pmod p.$$

Thus we have a contradiction.
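The multiplicativity of the content is easy to observe on examples. The following short sketch assumes polynomials with integer coefficients stored as lists (lowest degree first); the function names are illustrative.

```python
from functools import reduce
from math import gcd

def content(p):
    """Greatest common divisor of the coefficients of a nonzero integer polynomial."""
    return reduce(gcd, (abs(c) for c in p))

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

f = [6, 10, 4]      # content 2
g = [9, 3, 15]      # content 3
fg = poly_mul(f, g)
```

Here `content(fg)` coincides with `content(f) * content(g)`, as the lemma asserts.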


Corollary. A polynomial with integer coefficients is irreducible over Z if 
and only if it is irreducible over Q. 


Proof. Let $f \in \mathbb{Z}[x]$ and $f = gh$, where $g, h \in \mathbb{Q}[x]$. We may assume
that $\operatorname{cont}(f) = 1$. For $g$, select a positive integer $m$ such that $mg \in \mathbb{Z}[x]$.
Let $n = \operatorname{cont}(mg)$. Then the rational number $r = \dfrac{m}{n}$ is such that $rg \in \mathbb{Z}[x]$ and
$\operatorname{cont}(rg) = 1$. Similarly, select a positive rational number $s$ for $h$. Let us show
that in this case $rs = 1$, i.e., the factorization $f = (rg)(sh)$ is a factorization
over $\mathbb{Z}$. Indeed, thanks to Gauss's lemma, $\operatorname{cont}(rg) \operatorname{cont}(sh) = \operatorname{cont}(rsgh)$, i.e.,
$1 = \operatorname{cont}(rsf)$. Since $\operatorname{cont}(f) = 1$ we deduce that $rs = 1$.


Kronecker suggested the following algorithm for factorization of any polynomial
$f \in \mathbb{Z}[x]$ into irreducible factors (Kronecker's algorithm). Let $\deg f = n$
and $r = \left[\dfrac{n}{2}\right]$. If $f(x)$ is reducible, it has a divisor $g(x)$ of degree not higher
than $r$.

To find this divisor $g(x)$, consider the numbers $c_j = f(j)$, where $j = 0, 1, \dots, r$.
If $c_j = 0$, then $x - j$ divides $f(x)$. If on the contrary $c_j \ne 0$, then
$g(j)$ divides $c_j$. To every set $d_0, \dots, d_r$ of divisors of the numbers $c_0, \dots, c_r$,
respectively, there corresponds precisely one polynomial $g(x)$, of degree not
higher than $r$, such that $g(j) = d_j$ for $j = 0, 1, \dots, r$. Namely,

$$g(x) = \sum_{j=0}^{r} d_j g_j(x), \quad\text{where}\quad g_j(x) = \prod_{0 \le k \le r,\ k \ne j} \frac{x - k}{j - k}.$$

For each such polynomial, one has to verify if its coefficients are integers and
if it actually divides $f(x)$.
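The search described above can be sketched as follows. This is only an illustration under stated assumptions: $f$ has integer coefficients and degree at least 2, polynomials are coefficient lists with the lowest degree first, all names are hypothetical, and no attempt is made at efficiency (the number of candidate sets $d_0, \dots, d_r$ grows quickly).

```python
from fractions import Fraction
from itertools import product

def divisors(c):
    """All positive and negative divisors of the nonzero integer c."""
    c = abs(c)
    pos = [d for d in range(1, c + 1) if c % d == 0]
    return pos + [-d for d in pos]

def interpolate(values):
    """Polynomial g of degree <= r with g(j) = values[j] for j = 0, ..., r."""
    r = len(values) - 1
    g = [Fraction(0)] * (r + 1)
    for j, d in enumerate(values):
        basis = [Fraction(1)]                 # builds g_j = prod (x - k) / (j - k)
        for k in range(r + 1):
            if k != j:
                new = [Fraction(0)] * (len(basis) + 1)
                for i, b in enumerate(basis):
                    new[i + 1] += b / (j - k)
                    new[i] -= b * k / (j - k)
                basis = new
        for i, b in enumerate(basis):
            g[i] += d * b
    while len(g) > 1 and g[-1] == 0:          # drop zero leading coefficients
        g.pop()
    return g

def exact_quotient(f, g):
    """Quotient f / g if g divides f over Q, otherwise None."""
    f = [Fraction(c) for c in f]
    q = [Fraction(0)] * (len(f) - len(g) + 1)
    for i in range(len(f) - len(g), -1, -1):
        q[i] = f[i + len(g) - 1] / g[-1]
        for j, b in enumerate(g):
            f[i + j] -= q[i] * b
    return q if not any(f) else None

def kronecker_factor(f):
    """A divisor of f of degree between 1 and deg f // 2 over Z, or None."""
    r = (len(f) - 1) // 2
    values = [sum(c * j ** i for i, c in enumerate(f)) for j in range(r + 1)]
    for j, c in enumerate(values):
        if c == 0:
            return [-j, 1]                    # x - j divides f
    for ds in product(*(divisors(c) for c in values)):
        g = interpolate(ds)
        if len(g) < 2 or any(c.denominator != 1 for c in g):
            continue
        q = exact_quotient(f, g)
        if q is not None and all(c.denominator == 1 for c in q):
            return [int(c) for c in g]
    return None
```

For instance, `kronecker_factor([4, 0, 0, 0, 1])` finds a quadratic divisor of $x^4 + 4 = (x^2 + 2x + 2)(x^2 - 2x + 2)$, while `kronecker_factor([1, 0, 1])` finds none for $x^2 + 1$.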

Other, more effective, algorithms for factorization of polynomials into ir- 
reducible factors are given below (see p. 71-73 and 279-288). 


2.1.2 Eisenstein's criterion


One of the best known irreducibility criteria of polynomials is the following 
Eisenstein’s criterion. 


Theorem 2.1.3 (Eisenstein's criterion). Let $f(x) = a_0 + a_1 x + \dots + a_n x^n$
be a polynomial with integer coefficients such that the coefficient $a_n$ is not
divisible by a prime $p$, while the coefficients $a_0, \dots, a_{n-1}$ are divisible by $p$ but
$a_0$ is not divisible by $p^2$. Then $f$ is irreducible over $\mathbb{Z}$.


Proof. Suppose that

$$f = gh = \Bigl(\sum b_k x^k\Bigr)\Bigl(\sum c_l x^l\Bigr),$$

where $g$ and $h$ are polynomials of positive degree with integer coefficients.
The number $b_0 c_0 = a_0$ is divisible by $p$, and hence one of the numbers
$b_0$ and $c_0$ is divisible by $p$. Let, for the sake of definiteness, $b_0$ be
divisible by $p$. Then $c_0$ is not divisible by $p$ since $a_0 = b_0 c_0$ is not
divisible by $p^2$. If all the numbers $b_i$ are divisible by $p$, then so is
$a_n$. Therefore $b_i$ is not divisible by $p$ for some $i$, where
$0 < i \le \deg g < n$; we may assume that $i$ is the least index of the numbers
$b_i$ not divisible by $p$.

On the one hand, by assumption the number $a_i$ is divisible by $p$. On the
other hand, $a_i = b_i c_0 + b_{i-1} c_1 + \dots + b_0 c_i$, where all the
summands $b_{i-1} c_1, \dots, b_0 c_i$ are divisible by $p$ while $b_i c_0$ is
not divisible by $p$: a contradiction.
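The hypotheses of the criterion are easy to test mechanically. The following sketch (the function names are hypothetical, and coefficients are listed in increasing degree) returns all primes for which Eisenstein's criterion applies to a given integer polynomial; an empty list only means that the criterion is inconclusive, not that the polynomial is reducible.

```python
def prime_factors(n):
    """Distinct prime divisors of a nonzero integer, by trial division."""
    n = abs(n)
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def eisenstein_primes(a):
    """Primes p such that p does not divide a[-1], p divides a[0..n-1],
    and p**2 does not divide a[0]; a[i] is the coefficient of x**i."""
    if len(a) < 2 or a[0] == 0:
        return []
    return [p for p in prime_factors(a[0])
            if a[-1] % p != 0
            and all(c % p == 0 for c in a[:-1])
            and a[0] % (p * p) != 0]
```

For instance, `eisenstein_primes([-6, 0, 0, 0, 1])` reports both 2 and 3 for $x^4 - 6$, whereas for $x^2 + 4$ no prime qualifies.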


Example 2.1.4. Let $p$ be a prime and let $q$ be not divisible by $p$. Then
$x^n - pq$ is irreducible over $\mathbb{Z}$.


Example 2.1.5. If $p$ is a prime, then $f(x) = x^{p-1} + x^{p-2} + \dots + x + 1$
is irreducible.

Indeed, one can apply Eisenstein's criterion to the polynomial

$$f(x+1) = \frac{(x+1)^p - 1}{(x+1) - 1}
= x^{p-1} + \binom{p}{1} x^{p-2} + \dots + \binom{p}{p-1}.$$

Example 2.1.6. For any positive integer $n$, the polynomial

$$f(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \dots + \frac{x^n}{n!}$$

is irreducible.

Proof. We have to prove that the polynomial

$$n!\,f(x) = x^n + n x^{n-1} + n(n-1) x^{n-2} + \dots + n!$$

is irreducible over $\mathbb{Z}$. To this end, it suffices to find the prime $p$
such that $n!$ is divisible by $p$ but is not divisible by $p^2$, i.e.,
$p \le n < 2p$.

Let $n = 2m$ or $n = 2m + 1$. Bertrand's postulate (for its proof, see, e.g.,
[Ch1]) states that

there exists a prime $p$ such that $m < p \le 2m$.

For $n = 2m$, the inequalities $p \le n < 2p$ are obvious. For $n = 2m + 1$, we
obtain the inequalities $p \le n - 1$ and $n - 1 < 2p$. But in this case the
number $n - 1$ is even, and hence the inequality $n - 1 < 2p$ implies $n < 2p$.
It is also clear that $p \le n - 1 < n$.


2.1.3 Irreducibility modulo p 


Let $\mathbb{F}_p$ be the residue field modulo $p$. Every polynomial with
integer coefficients can be also considered as a polynomial with coefficients
from $\mathbb{F}_p$. A polynomial irreducible over $\mathbb{Z}$ can become
reducible over $\mathbb{F}_p$ for all $p$, and the construction of an example to
show this is based on the following theorem.


Theorem 2.1.7. The polynomial $P(x) = x^4 + ax^2 + b^2$, where
$a, b \in \mathbb{Z}$, is reducible over $\mathbb{F}_p$ for all primes $p$.


Proof. For $p = 2$, there are only 4 polynomials of the form indicated,
namely,

$$x^4, \quad x^4 + x^2 = x^2(x^2 + 1), \quad x^4 + 1 = (x + 1)^4, \quad
x^4 + x^2 + 1 = (x^2 + x + 1)^2.$$

All these polynomials are reducible.

Let $p$ be an odd prime. Then we can select an integer $s$ such that
$a \equiv 2s \pmod p$. We have

$$P(x) = x^4 + 2s x^2 + b^2 \equiv (x^2 + s)^2 - (s^2 - b^2)
\equiv (x^2 + b)^2 - (2b - 2s)x^2
\equiv (x^2 - b)^2 - (-2b - 2s)x^2 \pmod p.$$

Thus it suffices to prove that one of the numbers $s^2 - b^2$, $2b - 2s$,
$-2b - 2s$ is a quadratic residue modulo $p$.

Let us recall the basic notions of the theory of quadratic residues. Under
the map $x \mapsto x^2$ the elements $x$ and $-x$ turn into the same element.
Therefore the image of the set of nonzero elements of $\mathbb{F}_p$ under this
map consists of $\frac{p-1}{2}$ elements. On the other hand, if $x = y^2$, then
$x^{(p-1)/2} = y^{p-1} = 1$, i.e., all the $\frac{p-1}{2}$ elements of the image
satisfy the equation $x^{(p-1)/2} = 1$, which cannot have more than
$\frac{p-1}{2}$ solutions. The elements that do not lie in the image of the map
$x \mapsto x^2$ satisfy the equation $x^{(p-1)/2} = -1$. Therefore, if two
integers are not perfect squares modulo $p$, then their product is a perfect
square modulo $p$.

Suppose that $2b - 2s$ and $-2b - 2s$ are not squares modulo $p$. Then their
product $4(s^2 - b^2)$ is a square modulo $p$, and hence so is $s^2 - b^2$.
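The proof is constructive: whichever of the three numbers is a square modulo $p$ yields an explicit splitting into two quadratics. The sketch below follows the three cases for an odd prime $p$; the function names are hypothetical, square roots are found by brute force (so it is meant for small $p$ only), and 0 is treated as a square.

```python
def sqrt_mod(v, p):
    """Some t with t*t = v (mod p), or None if v is not a square mod p."""
    v %= p
    for t in range(p):
        if t * t % p == v:
            return t
    return None

def split_mod_p(a, b, p):
    """Two quadratics (coefficients in increasing degree, reduced mod p)
    whose product is x^4 + a x^2 + b^2 modulo the odd prime p."""
    s = a * ((p + 1) // 2) % p          # (p + 1)/2 is the inverse of 2 mod p
    t = sqrt_mod(s * s - b * b, p)
    if t is not None:                   # (x^2 + s)^2 - t^2
        return [(s - t) % p, 0, 1], [(s + t) % p, 0, 1]
    t = sqrt_mod(2 * b - 2 * s, p)
    if t is not None:                   # (x^2 + b)^2 - (t x)^2
        return [b % p, (-t) % p, 1], [b % p, t, 1]
    t = sqrt_mod(-2 * b - 2 * s, p)     # a square by the argument above
    return [(-b) % p, (-t) % p, 1], [(-b) % p, t, 1]

def mul_mod(f, g, p):
    """Product of two coefficient lists modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            out[i + j] = (out[i + j] + x * y) % p
    return out
```

For $x^4 + 1$ (that is, $a = 0$, $b = 1$) and $p = 3$ the third case applies and gives $(x^2 + 2x + 2)(x^2 + x + 2)$ modulo 3.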


Example 1. The polynomial $x^4 + 1$ is irreducible over $\mathbb{Z}$ but
reducible modulo $p$ for all primes $p$.

Proof. It suffices to prove that $x^4 + 1$ is irreducible over $\mathbb{Z}$.
The roots of this polynomial are $\frac{\pm 1 \pm i}{\sqrt 2}$. Every
polynomial with real coefficients can have non-real roots only if they occur in
complex conjugate pairs. Therefore the only nontrivial real divisors of
$x^4 + 1$ are the polynomials $x^2 - \sqrt 2\,x + 1$ and $x^2 + \sqrt 2\,x + 1$,
whose roots are $\frac{1 \pm i}{\sqrt 2}$ and $\frac{-1 \pm i}{\sqrt 2}$,
respectively. Both these polynomials do not lie in $\mathbb{Z}[x]$.


Example 2. Let $c \in \mathbb{N}$ and $\sqrt c \notin \mathbb{Q}$. Then the
polynomial $P(x) = x^4 + 2(1 - c)x^2 + (1 + c)^2$ is irreducible over
$\mathbb{Z}$ but reducible modulo any prime $p$.

Proof. It suffices to prove that $P(x)$ is irreducible over $\mathbb{Z}$. It is
easy to verify that the roots of $P$ are equal to
$\pm\sqrt{-1 + c \pm 2i\sqrt c} = \pm i \pm \sqrt c$. Combining the complex
conjugate roots into pairs we obtain the polynomials
$x^2 \pm 2\sqrt c\,x + 1 + c$. These polynomials do not lie in $\mathbb{Z}[x]$.


2.2 Irreducibility criteria 
2.2.1 Dumas’s criterion 


Let $p$ be a fixed prime, and let $f(x) = \sum_{i=0}^{n} A_i x^i$ be a
polynomial with integer coefficients such that $A_0 A_n \ne 0$. Let us represent
the nonzero coefficients of $f$ in the form $A_i = a_i p^{\alpha_i}$, where
$a_i$ is an integer not divisible by $p$. To every nonzero coefficient
$a_i p^{\alpha_i}$ we assign a point in the plane with coordinates
$(i, \alpha_i)$. These points give rise to the Newton diagram of the polynomial
$f$ (corresponding to $p$). The construction of the diagram is as follows.

Let $P_0 = (0, \alpha_0)$ and $P_1 = (i_1, \alpha_{i_1})$, where $i_1$ is the
largest integer for which there are no points $(i, \alpha_i)$ below the line
$P_0 P_1$. Further, let $P_2 = (i_2, \alpha_{i_2})$, where $i_2$ is the largest
integer for which there are no points $(i, \alpha_i)$ below the line $P_1 P_2$,
etc. (fig. 2.1). The very last segment is of the form $P_{r-1} P_r$, where
$P_r = (n, \alpha_n)$. If some segments of the broken line $P_0 \dots P_r$ pass
through points with integer coordinates, then such points will be also
considered as vertices of the broken line. In this way, to the vertices
$P_0, \dots, P_r$ we add $s \ge 0$ more vertices. The resulting broken line
$Q_0 \dots Q_{r+s}$ is called the Newton diagram (here $Q_0 = P_0$ and
$Q_{r+s} = P_r$). The segments $P_l P_{l+1}$ and $Q_i Q_{i+1}$ will be called
sides and segments of the Newton diagram respectively, and the vectors
$\overrightarrow{Q_i Q_{i+1}}$ will be called the vectors of the segments of the
Newton diagram.

FIGURE 2.1. Newton diagram (the labelled vertices include $P_2 = Q_2$, $Q_3$
and $P_3 = Q_4$).


Consider the system of vectors of the segments for the Newton diagram, 
taking each vector with its multiplicity, i.e., as many times as it enters the set 
of vectors of segments. 
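The construction above amounts to taking the lower convex hull of the points $(i, \alpha_i)$ and cutting each side at its lattice points. A minimal sketch, assuming coefficients are listed in increasing degree with nonzero first and last entries (the function names are hypothetical):

```python
from math import gcd

def valuation(a, p):
    """Exponent of the prime p in the nonzero integer a."""
    v = 0
    while a % p == 0:
        a //= p
        v += 1
    return v

def newton_sides(coeffs, p):
    """Vertices P_0, ..., P_r of the Newton diagram (lower convex hull)."""
    points = [(i, valuation(a, p)) for i, a in enumerate(coeffs) if a != 0]
    hull = []
    for x3, y3 in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # drop the middle point if it lies on or above the new chord
            if (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append((x3, y3))
    return hull

def segment_vectors(coeffs, p):
    """Vectors of the segments Q_i Q_{i+1}, listed with multiplicity."""
    hull = newton_sides(coeffs, p)
    vectors = []
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        t = gcd(x2 - x1, abs(y2 - y1))   # a side splits into t segments
        vectors += [((x2 - x1) // t, (y2 - y1) // t)] * t
    return vectors
```

For $f = x^3 + 2x^2 + 2x + 4 = (x^2 + 2)(x + 2)$ and $p = 2$ the vectors are $(1, -1)$ and $(2, -1)$, which are exactly the vectors obtained for $x + 2$ and $x^2 + 2$ separately.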


Theorem 2.2.1 (Dumas, [Du2]). Let f = gh, where f, g, and h are poly- 
nomials with integer coefficients. Then the system of vectors of the segments 
for f is the union of the systems of vectors of the segments for g and h (pro- 
vided p is the same for all the polynomials). 


Proof. [W] Let

$$f(x) = \sum_{i=0}^{n} a_i p^{\alpha_i} x^i, \quad
g(x) = \sum_{j \ge 0} b_j p^{\beta_j} x^j, \quad
h(x) = \sum_{k \ge 0} c_k p^{\gamma_k} x^k,$$

where the numbers $a_i, b_j, c_k$ are not divisible by $p$. Take a side of the
Newton diagram for $f$ (recall that a side $P_l P_{l+1}$ may consist of several
segments of the Newton diagram). Let the coordinates of $P_l$ and $P_{l+1}$ be
$(i_-, \alpha_{i_-})$ and $(i_+, \alpha_{i_+})$ respectively. The slope of
$P_l P_{l+1}$ is

$$M = \frac{\alpha_{i_+} - \alpha_{i_-}}{i_+ - i_-}.$$

Let $\alpha_{i_+} - \alpha_{i_-} = At$ and $i_+ - i_- = It$, where $t > 0$ is the
greatest common divisor of $\alpha_{i_+} - \alpha_{i_-}$ and $i_+ - i_-$. Then
$M = A/I$, where $(A, I) = 1$. The side $P_l P_{l+1}$ of the Newton diagram
belongs to the straight line

$$I\alpha - Ai = F, \quad\text{where}\quad
F = I\alpha_{i_+} - A i_+ = I\alpha_{i_-} - A i_-.$$

By assumption all the points $(i, \alpha_i)$, where $i = 0, 1, \dots, n$, lie on
or above this line, i.e., $I\alpha_i - Ai \ge F$, where this inequality is
strict for $i < i_-$ and $i > i_+$. The number $I\alpha - Ai$ will be called the
weight of the monomial $a p^{\alpha} x^i$, where $(a, p) = 1$. The numbers $i_-$
and $i_+$ are uniquely determined as the least and the greatest exponents of the
power of $x$ for the monomials entering $f$ with the minimal weight.

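For the polynomial $g$, consider the quantity

$$G = \min_{j} \,(I\beta_j - Aj),$$

and define $j_-$ and $j_+$ as the least and the greatest indices for which
$G = I\beta_{j_-} - A j_- = I\beta_{j_+} - A j_+$.
Similarly, for the polynomial $h$, consider the quantity

$$H = \min_{k} \,(I\gamma_k - Ak),$$

and define $k_-$ and $k_+$ as the least and the greatest indices for which
$H = I\gamma_{k_-} - A k_- = I\gamma_{k_+} - A k_+$.
Clearly,

$$a_{j_- + k_-}\, p^{\alpha_{j_- + k_-}} x^{j_- + k_-}
= \sum_{j + k = j_- + k_-} (b_j p^{\beta_j} x^j)(c_k p^{\gamma_k} x^k).$$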


The weight of the product of two terms is equal to the sum of their weights,
and therefore the weight of the summand with $j = j_-$ and $k = k_-$ is equal
to $G + H$. The weights of all the other summands are strictly greater than
$G + H$ since, for them, either $j < j_-$ or $k < k_-$.

Indeed, let, for example, $j < j_-$. Then the weight of $b_j p^{\beta_j} x^j$ is
strictly greater than $G$ and the weight of $c_k p^{\gamma_k} x^k$ is not less
than $H$.

The weight of $(b_j p^{\beta_j} x^j)(c_k p^{\gamma_k} x^k)$ for $j + k =$ const
increases monotonically as $\beta_j + \gamma_k$ grows since $I > 0$. In the case
considered, $j + k = j_- + k_-$, and therefore the sum $\beta_j + \gamma_k$ is
strictly minimal at $j = j_-$ and $k = k_-$. Therefore the weight of
$a_{j_- + k_-}\, p^{\alpha_{j_- + k_-}} x^{j_- + k_-}$ is equal to $G + H$.

It is also clear that for $i < j_- + k_-$ the weight of $a_i p^{\alpha_i} x^i$
is strictly greater than $G + H$, whereas for $i > j_- + k_-$ the weight of
$a_i p^{\alpha_i} x^i$ is not less than $G + H$. Therefore $G + H = F$ and
$j_- + k_- = i_-$. We similarly prove that $j_+ + k_+ = i_+$. Thus,

$$i_+ - i_- = (j_+ - j_-) + (k_+ - k_-). \tag{1}$$


In particular, one of the numbers $j_+ - j_-$ and $k_+ - k_-$ is nonzero.

If both the numbers $j_+ - j_-$ and $k_+ - k_-$ are nonzero, then the segment
with the end points $(j_-, \beta_{j_-})$ and $(j_+, \beta_{j_+})$ is a side of
the Newton diagram for $g$ and the segment with the end points
$(k_-, \gamma_{k_-})$ and $(k_+, \gamma_{k_+})$ is a side of the Newton diagram
for $h$. The slope of both segments is equal to $M = A/I$ since

$$\frac{\beta_{j_+} - \beta_{j_-}}{j_+ - j_-} = \frac{A}{I}
= \frac{\gamma_{k_+} - \gamma_{k_-}}{k_+ - k_-}.$$

Relation (1) shows that the sum of the lengths of the sides with slope $M$ of
the Newton diagrams for $g$ and $h$ is equal to the length of the side with the
same slope $M$ of the Newton diagram for $f$.

If one of the numbers $j_+ - j_-$ and $k_+ - k_-$ vanishes, then the Newton
diagram of one of the polynomials $g$ or $h$ has a side with slope $M$ and its
length is equal to the length of the side of the Newton diagram for $f$, whereas
the Newton diagram of the other polynomial has no side with slope $M$.

Thus, the vector of the side with slope M of the Newton diagram for f 
is equal to the sum of the vectors of the sides with the same slope M of the 
Newton diagrams for g and h. Relation (1) shows that if one of the Newton 
diagrams for g and h possesses a side with a certain slope M, then the Newton 
diagram for f should also possess a side with the same slope. 


Corollary. If, for a prime $p$, the Newton diagram for $f$ consists of
precisely one segment, i.e., consists of a segment containing no points with
integer coordinates, then $f$ is irreducible.


Let us give now three examples of the application of Dumas’s criterion to 
the proof of irreducibility of polynomials. 


Example 2.2.2 (Eisenstein's criterion). Let $f = a_0 + a_1 x + \dots + a_n x^n$
be a polynomial with integer coefficients such that, for a prime $p$, the
coefficient $a_n$ is not divisible by $p$, the coefficients
$a_0, \dots, a_{n-1}$ are divisible by $p$ and $a_0$ is not divisible by $p^2$.
Then $f$ is irreducible.


Proof. The Newton diagram for f consists of one segment with the end 
points (0,1) and (n,0); inside this segment there are no points with integer 
coordinates. 


Example 2.2.3. Let $p$ be a prime, $(c, p) = 1$ and $(m, n) = 1$. Then the
polynomial $x^n + c p^m$ is irreducible.


Proof. The Newton diagram for the polynomial considered is a segment 
with the end points (0,m) and (n,0). Since (m,n) = 1, there are no points 
with integer coordinates inside this segment. 


Example 2.2.4. Let $p$ be a prime. If the polynomial
$f(x) = x^n + px + bp^2$, where $(b, p) = 1$, has no integer roots, then this
polynomial is irreducible.

Proof. The Newton diagram for $f$ is the union of the segment with the
end points $(0, 2)$ and $(1, 1)$ and the segment with the end points $(1, 1)$
and $(n, 0)$. Inside these segments, there are no points with integer
coordinates. Therefore the nontrivial factorization of $f$ over $\mathbb{Z}$ can
consist only of a linear factor and a factor of degree $n - 1$.


2.2.2 Polynomials with a dominant coefficient 


In certain situations one can ensure that a polynomial with a sufficiently large 
coefficient will necessarily be irreducible. Among criteria of this type, the best 
known one is the following Perron’s criterion. 


Theorem 2.2.5 ([Pe]). Let $f(x) = x^n + a_1 x^{n-1} + \dots + a_n$ be a
polynomial with integer coefficients such that $a_n \ne 0$.

a) If $|a_1| > 1 + |a_2| + \dots + |a_n|$, then $f$ is irreducible.

b) If $|a_1| \ge 1 + |a_2| + \dots + |a_n|$ and $f(\pm 1) \ne 0$, then $f$ is
irreducible.


Proof. a) Let us prove first that all the roots of $f$, except precisely one
root, lie inside the unit disk $|z| < 1$. Clearly, the polynomial

$$g(x) = x^n + a_1 x^{n-1}$$

satisfies this property, i.e., all the roots of $g$, except precisely one root,
lie inside the unit disk $|z| < 1$. Hence by Rouché's theorem (see p. 1) it
suffices to prove that for $|z| = 1$ we have

$$|f(z) - g(z)| < |f(z)| + |g(z)|.$$

But for $|z| = 1$ we have, on the one hand,

$$|f(z) - g(z)| = |a_2 z^{n-2} + \dots + a_n| \le |a_2| + \dots + |a_n|
< |a_1| - 1, \tag{1}$$

and, on the other hand,

$$|f(z)| + |g(z)| \ge |g(z)| = |z^n + a_1 z^{n-1}| = |z + a_1|
\ge |a_1| - 1. \tag{2}$$

Suppose now that, on the contrary, $f$ can be represented as the product of
polynomials $f_1$ and $f_2$ of positive degree with integer coefficients. The
product of the roots of each of the polynomials $f_1$ and $f_2$ is a non-zero
integer, and therefore each of these polynomials has a root whose absolute value
is not less than 1. But $f$ has only one such root, and we have a
contradiction.

b) If $|a_1| = 1 + |a_2| + \dots + |a_n|$, then inequality (1) becomes
non-strict. But if $f(\pm 1) \ne 0$, inequality (2) becomes strict. Indeed, for
$|z| = 1$ the equality

$$|f(z)| + |g(z)| = |a_1| - 1$$

is only possible when simultaneously $|f(z)| = 0$ and $|z + a_1| = |a_1| - 1$.
The latter equality can only hold if $z \in \mathbb{R}$. Since $|z| = 1$, it
follows that $z = \pm 1$.
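Checking the hypotheses of Perron's criterion is purely arithmetical. The sketch below (the function name and the return convention are hypothetical) takes the list $[a_1, \dots, a_n]$ of a monic polynomial and reports which part of Theorem 2.2.5 applies, if any; `None` means only that the criterion gives no information.

```python
def perron(a):
    """a = [a_1, ..., a_n] for f = x^n + a_1 x^(n-1) + ... + a_n.
    Returns "a" or "b" for the applicable part of the theorem, else None."""
    if not a or a[-1] == 0:
        return None
    n = len(a)
    lead = abs(a[0])
    bound = 1 + sum(abs(c) for c in a[1:])

    def f(x):
        # a[i] is the coefficient of x^(n - 1 - i)
        return x**n + sum(c * x**(n - 1 - i) for i, c in enumerate(a))

    if lead > bound:
        return "a"
    if lead == bound and f(1) != 0 and f(-1) != 0:
        return "b"
    return None
```

For example, $x^3 + 5x^2 + x + 1$ falls under part a), $x^3 + 3x^2 + x + 1$ under part b), while $x^2 + 2x + 1 = (x + 1)^2$ fails the condition $f(\pm 1) \ne 0$.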
Theorem 2.2.6 ([Br]). Let $a_1 \ge a_2 \ge \dots \ge a_n$ be positive integers
and $n \ge 2$. Then the polynomial
$p(x) = x^n - a_1 x^{n-1} - a_2 x^{n-2} - \dots - a_n$ is irreducible over
$\mathbb{Z}$.

Proof. Consider the polynomial $f(x) = (x - 1)p(x)$. Clearly,

$$f(x) = x^{n+1} - b_1 x^n + b_2 x^{n-1} + \dots + b_n x + b_{n+1},$$

where $b_1 = a_1 + 1$, $b_2 = a_1 - a_2$, ..., $b_n = a_{n-1} - a_n$,
$b_{n+1} = a_n$. The numbers $b_1, \dots, b_{n+1}$ are nonnegative integers
(with $b_1$ and $b_{n+1}$ positive) and $b_1 = 1 + b_2 + \dots + b_{n+1}$.
Therefore $f(x)$ satisfies one of the conditions of Theorem 2.2.5 (b). But it
does not satisfy the second condition since $f(1) = 0$. We must therefore apply
a subtler argument. Let

$$h(z) = b_1 z^n - b_2 z^{n-1} - \dots - b_{n+1}.$$

First we show that, for all sufficiently small $\varepsilon > 0$, we have

$$|h(z)| > |z^{n+1}| = |f(z) + h(z)|$$

everywhere on the circle $|z| = 1 + \varepsilon$. Indeed, if
$|z| = 1 + \varepsilon$, then

$$|h(z)| - |z^{n+1}| \ge b_1 (1 + \varepsilon)^n - b_2 (1 + \varepsilon)^{n-1}
- \dots - b_{n+1} - (1 + \varepsilon)^{n+1}
= \varepsilon\,(b_2 + 2b_3 + \dots + (n - 1)b_n + n b_{n+1} - 1) + \dots$$

The coefficient of $\varepsilon$ is positive, and therefore, for sufficiently
small $\varepsilon > 0$, we have $|h(z)| - |z^{n+1}| > 0$. In this case

$$|f(z) + h(z)| = |z^{n+1}| < |h(z)| \le |f(z)| + |h(z)|.$$

Therefore, by Rouché's theorem, the polynomial $f(z)$ has as many roots inside
the disk $|z| < 1 + \varepsilon$ as $h(z)$ does. But all the roots of $h(z)$ lie
strictly inside the unit disk $|z| < 1$. Indeed, if $|z| \ge 1$, then

$$|h(z)| \ge b_1 |z|^n - b_2 |z|^{n-1} - \dots - b_{n+1}
\ge |z|^n (b_1 - b_2 - \dots - b_{n+1}) = |z|^n > 0.$$

Letting $\varepsilon \to 0$, we see that inside and on the boundary of the unit
disk there are exactly $n$ roots of the polynomial $f(x) = (x - 1)p(x)$. Hence
exactly $n - 1$ roots of $p(x)$ lie inside the unit disk and at least one of its
roots lies outside it. Hence $p$ is irreducible.


A criterion similar to Perron's criterion but with a condition on the
constant term instead of $a_1$ also holds. It holds, however, only if the
constant term is a prime.

Theorem 2.2.7 ([Os1]). Let
$f(x) = x^n + a_1 x^{n-1} + \dots + a_{n-1} x \pm p$ be a polynomial with
integer coefficients, where $p$ is a prime.

a) If $p > 1 + |a_1| + \dots + |a_{n-1}|$, then $f$ is irreducible.

b) If $p \ge 1 + |a_1| + \dots + |a_{n-1}|$ and among the roots of $f$ there are
no roots of unity, then $f$ is irreducible.

Proof. Suppose that $f(x) = g(x)h(x)$, where $g$ and $h$ are polynomials of
positive degree with integer coefficients. The product of the constant terms of
$g$ and $h$ is equal to $\pm p$. Since $p$ is prime, one of these constant terms
is equal to $\pm 1$. Therefore the product of the absolute values of the roots
of one of the polynomials $g$ and $h$ is equal to 1. This polynomial must
therefore possess a root $\alpha$ such that $|\alpha| \le 1$. Since $\alpha$ is
also a root of $f$, it follows from $f(\alpha) = 0$ that

$$p = |\alpha^n + a_1 \alpha^{n-1} + \dots + a_{n-1}\alpha|
\le 1 + |a_1| + \dots + |a_{n-1}|.$$

In case a) we arrive at a contradiction.
In case b), $\alpha$ is not a root of unity. Hence $|\alpha| < 1$, and therefore

$$p < 1 + |a_1| + \dots + |a_{n-1}|.$$

A contradiction again.


2.2.3 Irreducibility of polynomials attaining small values 


Theorem 2.2.8 (Pólya). Let $f$ be a polynomial of degree $n$ with integer
coefficients and define $m = \left[\frac{n+1}{2}\right]$. Suppose that, for $n$
different integers $a_1, \dots, a_n$, we have $|f(a_i)| < 2^{-m} m!$ and the
numbers $a_1, \dots, a_n$ are not roots of $f$. Then $f$ is irreducible.

Proof. We will need the following auxiliary statement.

Lemma. Let $g$ be a polynomial of degree $k$ with integer coefficients, and
let $d_0 < d_1 < \dots < d_k$ be integers. Then $|g(d_i)| \ge k!\,2^{-k}$ for
some $i$.

Proof. Consider the polynomial

$$G(x) = \sum_{i=0}^{k} g(d_i) \prod_{j \ne i} \frac{x - d_j}{d_i - d_j}.$$

It is easy to see that $G(d_i) = g(d_i)$ for $i = 0, \dots, k$, and
$\deg G \le k$. Hence $G(x) = g(x)$.

The highest coefficient of $G$ is equal to

$$\sum_{i=0}^{k} \frac{g(d_i)}{\prod_{j \ne i} (d_i - d_j)}.$$

By assumption this coefficient is a nonzero integer, and hence its absolute
value is $\ge 1$. Therefore one of the numbers $|g(d_i)|$ is not less than

$$\Bigl(\sum_{0 \le i \le k} \prod_{j \ne i} \frac{1}{|d_i - d_j|}\Bigr)^{-1}
\ge \Bigl(\sum_{0 \le i \le k} \prod_{j \ne i} \frac{1}{|i - j|}\Bigr)^{-1}
= \Bigl(\sum_{0 \le i \le k} \frac{1}{i!\,(k - i)!}\Bigr)^{-1}
= k!\,\Bigl(\sum_{0 \le i \le k} \binom{k}{i}\Bigr)^{-1} = \frac{k!}{2^k}.$$

Returning to the proof of the theorem, we suppose on the contrary that
$f = gh$, where $g$ and $h$ are polynomials with integer coefficients. We may
assume that $\deg h \le \deg g = k$. Then $m \le k < n$. Clearly,
$g(a_i) \ne 0$, and $g(a_i)$ divides $f(a_i)$. Hence

$$|g(a_i)| \le |f(a_i)| < 2^{-m} m!.$$

On the other hand, by Pólya's lemma, we have $|g(a_i)| \ge 2^{-k} k!$ for one of
the $a_i$ (we apply Pólya's lemma to $d_i = a_{i+1}$). It remains to notice
that, since $k \ge m$, it follows that $2^{-k} k! \ge 2^{-m} m!$. Indeed, if
$k = m + r$, then

$$\frac{k!}{m!} = (m + 1)(m + 2) \cdots (m + r) \ge 2^r = \frac{2^k}{2^m}.$$


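Example. For $n \ge 7$, the polynomial $(x - 1)(x - 2) \cdots (x - n) + 1$ is
irreducible: it takes the value 1 at the points $a_i = i$, and
$2^{-m} m! > 1$ as soon as $m \ge 4$.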
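The size of the bound $2^{-m} m!$ matters in this example. The sketch below (helper names are hypothetical; coefficients are listed in increasing degree) evaluates the hypotheses of Theorem 2.2.8 for $(x - 1) \cdots (x - n) + 1$ at the points $1, \dots, n$. The bound exceeds 1 only from $n = 7$ on, and for $n = 4$ the polynomial is in fact the square of $x^2 - 5x + 5$.

```python
from fractions import Fraction
from math import factorial

def poly_mul(f, g):
    """Product of two integer coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            out[i + j] += x * y
    return out

def poly_eval(f, x):
    return sum(a * x**i for i, a in enumerate(f))

def polya_bound(n):
    """The quantity 2^(-m) * m! with m = [(n + 1)/2]."""
    m = (n + 1) // 2
    return Fraction(factorial(m), 2**m)

def product_plus_one(n):
    """Coefficients of (x - 1)(x - 2)...(x - n) + 1."""
    f = [1]
    for k in range(1, n + 1):
        f = poly_mul(f, [-k, 1])
    f[0] += 1
    return f

def polya_applies(f, points):
    """True if the points satisfy the hypotheses of Theorem 2.2.8 for f."""
    n = len(f) - 1
    values = [abs(poly_eval(f, a)) for a in points]
    return (len(set(points)) == n
            and all(0 < v < polya_bound(n) for v in values))
```

Here `polya_applies(product_plus_one(7), range(1, 8))` holds, since the bound is $3/2$, whereas for $n = 4$ the bound is only $1/2$ and the test fails.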


For other irreducibility criteria for polynomials attaining small values, see 
[Tv]. 


2.3 Irreducibility of trinomials and fournomials 


2.3.1 Irreducibility of polynomials of the form $x^n \pm x^m \pm x^p \pm 1$


Let $f(x) = x^n + \varepsilon_1 x^m + \varepsilon_2 x^p + \varepsilon_3$, where
$n > m > p \ge 1$ and $\varepsilon_i = \pm 1$. Let us find out, following [Lj],
when $f$ is irreducible. We first show that it suffices to consider the case
where $m + p \ge n$. Clearly, $f$ is irreducible if and only if the polynomial

$$x^n f\!\left(\frac{1}{x}\right) = 1 + \varepsilon_1 x^{n-m}
+ \varepsilon_2 x^{n-p} + \varepsilon_3 x^n$$

is irreducible and then, if $m + p < n$, we have $(n - m) + (n - p) > n$.

We may also exclude from further considerations the trivial case
$f(x) = (x^m + \varepsilon_2)(x^p + \varepsilon_1)$, i.e., when $n = m + p$ and
$\varepsilon_3 = \varepsilon_1 \varepsilon_2$.

The polynomial $\psi(x)$ of degree $s$ is said to be recursive if

$$\psi(x) = \pm x^s \psi\!\left(\frac{1}{x}\right).$$

Lemma 2.3.1. Let $f(x) = \varphi(x)\psi(x)$, where $\varphi(x)$ and $\psi(x)$
are monic polynomials of positive degree with integer coefficients. Then at
least one of the polynomials $\varphi(x)$ and $\psi(x)$ is recursive.


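Proof. Let $r = \deg \varphi$ and $s = n - r = \deg \psi$. Consider the
polynomials

$$f_1(x) = x^r \varphi\!\left(\frac{1}{x}\right)\psi(x)
= \sum_{i=0}^{n} c_i x^{n-i},$$

$$f_2(x) = x^s \psi\!\left(\frac{1}{x}\right)\varphi(x)
= x^n f_1\!\left(\frac{1}{x}\right) = \sum_{i=0}^{n} c_i x^i.$$

Clearly,

$$f_1(x) f_2(x) = x^n f(x) f\!\left(\frac{1}{x}\right)
= (x^n + \varepsilon_1 x^m + \varepsilon_2 x^p + \varepsilon_3)
(\varepsilon_3 x^n + \varepsilon_2 x^{n-p} + \varepsilon_1 x^{n-m} + 1).$$

Comparison of the coefficients of $x^{2n}$ shows that
$c_0 c_n = \varepsilon_3$, and hence $c_0 = \pm 1$ and $c_n = \pm 1$. Comparison
of the coefficients of $x^n$ shows that

$$c_0^2 + c_1^2 + \dots + c_{n-1}^2 + c_n^2
= 1 + \varepsilon_1^2 + \varepsilon_2^2 + \varepsilon_3^2 = 4,$$

i.e., $c_1^2 + \dots + c_{n-1}^2 = 2$. Thus, $c_0 = \pm 1$, $c_n = \pm 1$,
$c_\alpha = \pm 1$ and $c_\beta = \pm 1$ for some
$1 \le \alpha < \beta \le n - 1$, all the other coefficients $c_i$ being zero.
Hence $f_1(x) f_2(x)$ can be expressed in the following two forms:

$$c_0 c_n x^{2n} + c_\alpha c_n x^{2n-\alpha} + c_\beta c_n x^{2n-\beta}
+ c_0 c_\beta x^{n+\beta} + c_0 c_\alpha x^{n+\alpha}
+ c_\alpha c_\beta x^{n+\beta-\alpha} + 4x^n + \dots \tag{1}$$

and

$$\varepsilon_3 x^{2n} + \varepsilon_2 x^{2n-p} + \varepsilon_1 x^{2n-m}
+ \varepsilon_1\varepsilon_3 x^{n+m} + \varepsilon_2\varepsilon_3 x^{n+p}
+ \varepsilon_1\varepsilon_2 x^{n+m-p} + 4x^n + \dots \tag{2}$$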
In order to compare (1) and (2), let us order the monomials with respect to
the size of the degrees taking into account only the three highest monomials.
For (1), we obtain the four possibilities:

$\beta \le \frac{n}{2}$ : $2n > 2n - \alpha > 2n - \beta$,

$\beta > \frac{n}{2}$, $\alpha \le n - \beta$ : $2n > 2n - \alpha \ge n + \beta$,

$\beta > \frac{n}{2}$, $\frac{n}{2} \ge \alpha > n - \beta$ : $2n > n + \beta > 2n - \alpha$,

$\beta > \frac{n}{2}$, $\alpha > \frac{n}{2}$ : $2n > n + \beta > n + \alpha$.

For (2), we obtain the two possibilities:

$n \ge 2m$ : $2n > 2n - p > 2n - m$,

$2m > n \ge m + p$ : $2n > 2n - p \ge n + m$.

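Comparing the three highest monomials in (1) and (2) we obtain for the pair
$(\alpha, \beta)$ the following four possibilities:

$$(\alpha, \beta) = (p, m),\ (p, n - m),\ (m, n - p)\ \text{or}\ (n - m, n - p).$$

If $(\alpha, \beta) = (p, m)$, comparison of (1) with (2) shows that

$$c_0 c_n = \varepsilon_3, \quad c_p c_n = \varepsilon_2, \quad
c_m c_n = \varepsilon_1.$$

Hence

$$f_1(x) = c_n (\varepsilon_3 x^n + \varepsilon_2 x^{n-p}
+ \varepsilon_1 x^{n-m} + 1) = c_n x^n f\!\left(\frac{1}{x}\right).$$

Therefore $\psi(x) = c_n x^s \psi\!\left(\frac{1}{x}\right)$.

If $(\alpha, \beta) = (n - m, n - p)$, we similarly see that

$$c_0 c_n = \varepsilon_3, \quad c_0 c_{n-m} = \varepsilon_1, \quad
c_0 c_{n-p} = \varepsilon_2.$$

Hence

$$f_1(x) = c_0 (x^n + \varepsilon_1 x^m + \varepsilon_2 x^p + \varepsilon_3)
= c_0 f(x).$$

Therefore $\varphi(x) = c_0 x^r \varphi\!\left(\frac{1}{x}\right)$.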

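If $(\alpha, \beta) = (p, n - m)$, then we encounter in (1) monomials of
degrees

$$2n,\ 2n - p,\ n + m,\ n + p,\ 2n - m,\ 2n - m - p,\ n,$$

and in (2) monomials of degrees

$$2n,\ 2n - p,\ 2n - m,\ n + m,\ n + p,\ n + m - p,\ n.$$

Therefore the number $2n - m - p$ is equal to one of the three numbers:
$n + m$, $n + p$, $n + m - p$. The equalities $2n - m - p = n + m$ and
$2n - m - p = n + p$ contradict the assumption that $n \le m + p$, and hence
$2n - m - p = n + m - p$, i.e., $n = 2m$. Therefore
$(\alpha, \beta) = (p, m)$.

If $(\alpha, \beta) = (m, n - p)$, we similarly see that $n = 2m$, i.e.,
$(\alpha, \beta) = (n - m, n - p)$.

Lemma 2.3.2. Let $\lambda$ and $\lambda^{-1}$ be the roots of $f(x)$. Then one
of the following three pairs of conditions hold:

(I) $\lambda^n = -\varepsilon_3$ and $\lambda^{m-p} = -\varepsilon_1\varepsilon_2$,

(II) $\lambda^m = -\varepsilon_1\varepsilon_3$ and $\lambda^{n-p} = -\varepsilon_2$,

(III) $\lambda^p = -\varepsilon_2\varepsilon_3$ and $\lambda^{n-m} = -\varepsilon_1$.

Proof. The conditions $f(\lambda) = 0$ and $f(\lambda^{-1}) = 0$ can be
expressed as

$$\lambda^n + \varepsilon_1 \lambda^m + \varepsilon_2 \lambda^p + \varepsilon_3 = 0,
\quad
\lambda^n + \varepsilon_2\varepsilon_3 \lambda^{n-p}
+ \varepsilon_1\varepsilon_3 \lambda^{n-m} + \varepsilon_3 = 0.$$

By subtracting one equation from the other one we get

$$\varepsilon_2\varepsilon_3 \lambda^{n-p} + \varepsilon_1\varepsilon_3 \lambda^{n-m}
- \varepsilon_1 \lambda^m - \varepsilon_2 \lambda^p = 0,$$

i.e.,

$$(\varepsilon_2 \lambda^{m-p} + \varepsilon_1)
(\varepsilon_3 \lambda^{n-m} - \varepsilon_1\varepsilon_2 \lambda^p) = 0,$$

and hence either $\lambda^p = -\varepsilon_1\varepsilon_2 \lambda^m$ or
$\lambda^p = \varepsilon_1\varepsilon_2\varepsilon_3 \lambda^{n-m}$.
Substituting these values of $\lambda^p$ into the relation $f(\lambda) = 0$ we
obtain accordingly either $\lambda^n = -\varepsilon_3$ or

$$(\lambda^m + \varepsilon_1\varepsilon_3)(\lambda^{n-m} + \varepsilon_1) = 0.$$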

With the help of Lemmas 2.3.1 and 2.3.2 it is easy to prove the following
two theorems which in turn lead to a complete description of irreducible
polynomials of the form
$x^n + \varepsilon_1 x^m + \varepsilon_2 x^p + \varepsilon_3$. In both theorems
(as well as in Lemmas 2.3.1 and 2.3.2) we assume that $n \le m + p$ and
$f(x) \ne (x^m + \varepsilon_2)(x^p + \varepsilon_1)$.


Theorem 2.3.3. a) If the polynomial $f(x)$ has no roots which are roots of
unity, then $f(x)$ is irreducible.

b) If the polynomial $f(x)$ has exactly $q$ roots which are roots of unity,
then $f(x)$ can be represented as the product of two polynomials with integer
coefficients one of which is of degree $q$ and its roots are the given roots of
unity, while the other polynomial is irreducible.

Proof. Let $f(x) = \varphi(x)\psi(x)$, where $\varphi, \psi \in \mathbb{Z}[x]$.
By Lemma 2.3.1 we may assume that if $\lambda$ is a root of $\varphi$, then
$\lambda^{-1}$ is also a root of $\varphi$. It then follows from Lemma 2.3.2
that $\lambda$ is a root of unity. If not all the roots of $f$ are the roots of
unity, then either $\psi$ is irreducible over $\mathbb{Z}$ or
$\psi = \psi_1 \psi_2$, where $\psi_1, \psi_2 \in \mathbb{Z}[x]$ and all the
roots of $\psi_1$ are the roots of unity whereas $\psi_2$ has a root which is
not a root of unity. In this case all the roots of $\varphi\psi_1$ are the roots
of unity. By continuing the same arguments now applied to $\psi_2$ we obtain the
factorization desired of $f$.


It remains to determine when $f$ has roots which are roots of unity. The
answer is given by the following theorem.

Theorem 2.3.4. Let $d$ be the greatest common divisor of $n$, $m$, $p$. Set

$$n_1 = \frac{n}{d}, \quad m_1 = \frac{m}{d}, \quad p_1 = \frac{p}{d},$$

$$d_1 = (n_1, m_1 - p_1), \quad d_2 = (m_1, n_1 - p_1), \quad
d_3 = (p_1, n_1 - m_1).$$

Then any root of unity which is a root of $f$ satisfies one of the equations

$$x^{d d_1} = \pm 1, \quad x^{d d_2} = \pm 1, \quad x^{d d_3} = \pm 1$$

and it is a simple root of $f$.


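Proof. Let $\lambda$ be a root of unity which is a root of $f$. Then
$\lambda^{-1}$ is also a root of $f$. Lemma 2.3.2 provides three possibilities
for conditions on $\lambda$. Consider, e.g., case (I):
$\lambda^n = -\varepsilon_3$ and $\lambda^{m-p} = -\varepsilon_1\varepsilon_2$.
Clearly, $(n, m - p) = d d_1$, and hence there exist integers $u$ and $v$ such
that $d d_1 = nu + (m - p)v$. Therefore
$\lambda^{d d_1} = (\lambda^n)^u (\lambda^{m-p})^v = \pm 1$ since
$\lambda^n = -\varepsilon_3 = \pm 1$ and
$\lambda^{m-p} = -\varepsilon_1\varepsilon_2 = \pm 1$.
Cases (II) and (III) are similarly considered.

It remains to prove that $\lambda$ is a simple root of $f$, i.e.,

$$\lambda f'(\lambda) = n\lambda^n + \varepsilon_1 m \lambda^m
+ \varepsilon_2 p \lambda^p \ne 0.$$

Substituting (I), (II) and (III) into
$n\lambda^n + \varepsilon_1 m \lambda^m + \varepsilon_2 p \lambda^p = 0$ we
respectively obtain

$$\varepsilon_2 \lambda^p (p - m) = n\varepsilon_3, \quad
\varepsilon_2 \lambda^p (p - n) = m\varepsilon_3, \quad
\varepsilon_1 \lambda^m (m - n) = p\varepsilon_3.$$

The equality $|\lambda| = 1$ cannot occur in the first case whereas in the
second and third ones it means that $n = m + p$. If $n = m + p$, the relations
(II) take the form $\lambda^m = -\varepsilon_1\varepsilon_3$ and
$\lambda^m = -\varepsilon_2$ whereas relations (III) take the form
$\lambda^p = -\varepsilon_2\varepsilon_3$ and $\lambda^p = -\varepsilon_1$. In
both cases $\varepsilon_3 = \varepsilon_1\varepsilon_2$ which corresponds to the
excluded polynomial $f(x) = (x^m + \varepsilon_2)(x^p + \varepsilon_1)$.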


2.3.2 Irreducibility of certain trinomials 


Making use of the results obtained in the preceding section it is not difficult
to determine which of the trinomials $x^n \pm x^m \pm 1$ are irreducible.


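Theorem 2.3.5 ([Lj]). Let $n \ge 2m$, $d = (n, m)$, $n_1 = \frac{n}{d}$ and
$m_1 = \frac{m}{d}$. Then the trinomial

$$g(x) = x^n + \varepsilon x^m + \varepsilon', \quad\text{where }
\varepsilon = \pm 1 \text{ and } \varepsilon' = \pm 1,$$

is irreducible except for the three cases in which
$n_1 + m_1 \equiv 0 \pmod 3$:

a) $n_1$ and $m_1$ are odd and $\varepsilon = 1$.

b) $n_1$ is even and $\varepsilon' = 1$.

c) $m_1$ is even and $\varepsilon' = \varepsilon$.

In all these cases $g(x)$ is a product of
$x^{2d} + \varepsilon^{m_1} \varepsilon'^{\,n_1} x^d + 1$ by an irreducible
polynomial.

Proof. The case when $n = 2m$ and $\varepsilon' = 1$ is obvious. We will
therefore assume that either $n = 2m$ and $\varepsilon' = -1$ or $n > 2m$. For
such $n$ and $\varepsilon'$, we can apply Theorems 2.3.3 and 2.3.4 to the
product

$$(x^n + \varepsilon x^m + \varepsilon')(x^n - \varepsilon')
= x^{2n} + \varepsilon x^{n+m} - \varepsilon\varepsilon' x^m - 1$$

since $2n > n + m > m$, and if $2n = (n + m) + m$, i.e., $n = 2m$, then
$\varepsilon_3 \ne \varepsilon_1\varepsilon_2$.
In the notation of Theorem 2.3.4 we have

$$(2n, n + m, m) = (n, m) = d,$$
$$d_1 = (2n_1, n_1) = n_1,$$
$$d_3 = (m_1, n_1 - m_1) = 1$$

and $d_2 = (n_1 + m_1, 2n_1 - m_1) = (n_1 + m_1, 3n_1)$. Therefore

$$d_2 = \begin{cases} 1 & \text{if } n_1 + m_1 \not\equiv 0 \pmod 3, \\
3 & \text{if } n_1 + m_1 \equiv 0 \pmod 3. \end{cases}$$

By Theorem 2.3.4 the roots of unity which are roots of $g$ satisfy one of the
equations $x^{d d_1} = \pm 1$, $x^{d d_2} = \pm 1$, $x^{d d_3} = \pm 1$. The
first of these equations is of the form $x^n = \pm 1$, the third one is of the
form $x^d = \pm 1$ because $d_3 = 1$. If $x^n = \pm 1$, then
$g(x) = \pm 1 + \varepsilon x^m \pm 1 \ne 0$ and, if $x^d = \pm 1$, then
$g(x) = \pm 1 \pm 1 \pm 1 \ne 0$.

It remains to consider the case when $d_2 = 3$. In Lemma 2.3.2, case (I)
leads to the relations

$$\lambda^{2n} = \pm 1 \quad\text{and}\quad \lambda^n = \pm 1,$$

whereas case (III) leads to the relations

$$\lambda^m = \pm 1 \quad\text{and}\quad \lambda^{n-m} = \pm 1.$$

In both cases we get $\lambda^n = \pm 1$, and hence $g(\lambda) \ne 0$. Case
(II) leads to the relations $\lambda^{n+m} = \varepsilon$,
$\lambda^{2n-m} = \varepsilon\varepsilon'$, i.e., $\lambda^{3n} = \varepsilon'$,
$\lambda^{3m} = \varepsilon\varepsilon'$. Therefore
$(\lambda^{3d})^{n_1} = \varepsilon'$ and
$(\lambda^{3d})^{m_1} = \varepsilon\varepsilon'$, where $(n_1, m_1) = 1$ and
$n_1 + m_1 \equiv 0 \pmod 3$. From the condition $(n_1, m_1) = 1$ it follows
that $n_1 u + m_1 v = 1$ for some integers $u$ and $v$. Hence

$$\lambda^{3d} = \lambda^{3d n_1 u + 3d m_1 v}
= (\varepsilon')^u (\varepsilon\varepsilon')^v = \pm 1.$$

If $n_1$ and $m_1$ are odd, then
$\lambda^{3d} = \varepsilon' = \varepsilon\varepsilon'$, and hence
$\varepsilon = 1$ and $\lambda^{3d} = \varepsilon'$.

If $n_1$ is even, then $\varepsilon' = 1$.

If $m_1$ is even, then $\varepsilon\varepsilon' = 1$.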
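The case distinction of Theorem 2.3.5 can be written as a small decision procedure. This is a sketch under stated assumptions: the function names are hypothetical, and the sign of the middle term of the quadratic-type factor is taken to be $\varepsilon^{m_1}\varepsilon'^{\,n_1}$, which is the reading checked against the examples below.

```python
from math import gcd

def ljunggren(n, m, eps, eps1):
    """For x^n + eps*x^m + eps1 with n >= 2m: None if the theorem predicts
    irreducibility, else (d, s) meaning the factor x^(2d) + s*x^d + 1."""
    assert n >= 2 * m >= 2 and eps in (1, -1) and eps1 in (1, -1)
    d = gcd(n, m)
    n1, m1 = n // d, m // d
    if (n1 + m1) % 3 != 0:
        return None
    if n1 % 2 == 1 and m1 % 2 == 1:
        exceptional = eps == 1          # case a)
    elif n1 % 2 == 0:
        exceptional = eps1 == 1         # case b)
    else:
        exceptional = eps1 == eps       # case c)
    return (d, eps**m1 * eps1**n1) if exceptional else None

def trinomial(n, m, eps, eps1):
    """Coefficients of x^n + eps*x^m + eps1 in increasing degree."""
    f = [0] * (n + 1)
    f[n], f[m], f[0] = 1, eps, eps1
    return f

def remainder(f, g):
    """Remainder of f modulo a monic integer polynomial g."""
    rem = list(f)
    for i in range(len(f) - len(g), -1, -1):
        q = rem[i + len(g) - 1]
        for k, b in enumerate(g):
            rem[i + k] -= q * b
    return rem[:len(g) - 1]
```

For $x^5 + x + 1$ the function returns `(1, 1)`, matching the factor $x^2 + x + 1$; for $x^{10} + x^2 - 1$ it returns `(2, -1)`, matching $x^4 - x^2 + 1$; and for $x^5 + x^2 + 1$ it returns `None`.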


By Perron's criterion (Theorem 2.2.5 on page 56) the trinomial
$x^n \pm a x^{n-1} \pm 1$, where $a \ge 3$ is an integer, is irreducible. For
$a = 2$, this trinomial is irreducible if it has no roots equal to $\pm 1$. All
these statements also hold for the trinomial $x^n \pm ax \pm 1$.

Irreducibility of trinomials $x^n \pm 2x^m \pm 1$ is studied in [Sc2].

In conclusion we formulate two further theorems on the irreducibility of
trinomials.

Theorem 2.3.6 ([Mi2]). Let the trinomial $x^n + p x^m + 1$, where $n > m$ and
< 4P", 


p is a prime, be reducible. Then 


n, 


Theorem 2.3.7 ([Ra2]). a) The trinomial $x^5 + x + n$ factorizes into the
product of irreducible quadratic and cubic polynomials if and only if
$n = \pm 1$ or $n = \pm 6$.

b) The trinomial $x^5 - x + n$ factorizes into the product of irreducible
quadratic and cubic polynomials if and only if $n = \pm 15$, $n = \pm 22\,440$
or $n = \pm 2\,759\,640$.
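The smallest cases of this theorem can be confirmed by direct multiplication. The factorizations used below are not taken from the text; they were found by hand and are verified by the code itself (coefficients are listed in increasing degree, and `poly_mul` is a hypothetical helper).

```python
def poly_mul(f, g):
    """Product of two integer coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            out[i + j] += x * y
    return out

# x^5 + x + 1 = (x^2 + x + 1)(x^3 - x^2 + 1)
case_n1 = poly_mul([1, 1, 1], [1, 0, -1, 1])
# x^5 + x + 6 = (x^2 + x + 2)(x^3 - x^2 - x + 3)
case_n6 = poly_mul([2, 1, 1], [3, -1, -1, 1])
# x^5 - x + 15 = (x^2 + x + 3)(x^3 - x^2 - 2x + 5)
case_n15 = poly_mul([3, 1, 1], [5, -2, -1, 1])
```

The negative values of $n$ follow from these by the substitution $x \mapsto -x$, which changes the sign of the constant term of $x^5 \pm x + n$ up to an overall factor $-1$.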


2.4 Hilbert’s irreducibility theorem 


Let $f(t, x) \in \mathbb{Q}[t, x]$ be a polynomial in two variables. The
polynomial $f$ is called reducible if $f = gh$, where
$g, h \in \mathbb{Q}[t, x]$ are polynomials of positive degree.


Theorem 2.4.1 (Hilbert, [Hi3]). If $f(t, x)$ is an irreducible polynomial
over $\mathbb{Q}$, then there exist infinitely many rationals $t_0$ for which the
polynomial $f(t_0, x)$ in one indeterminate is irreducible over $\mathbb{Q}$.


We give the proof of Hilbert's theorem due to Dörge [Do2] using the
exposition of this proof given in [La3] and [Se2]. A proof using more modern
language can be found in [FY].

Let us start by establishing a relation between reducibility of the
polynomial $f(t_0, x)$ and existence of a point $(t_0, y_0)$ with rational
coordinates on a certain algebraic curve. Let $f(t, x) \in \mathbb{Q}[t, x]$ be
an irreducible polynomial. Let us represent it in the form

$$f(t, x) = a_n(t) x^n + \dots + a_0(t), \quad\text{where } a_i(t) \in \mathbb{Q}[t],$$

and we define the polynomial $F(x) = f(t, x)$ with coefficients from the field
$k = \mathbb{Q}(t)$. Let $\bar{k}$ be the algebraic closure of $k$. Then

$$F(x) = a_n(t)\,(x - \alpha_1) \cdots (x - \alpha_n),$$

where $\alpha_1, \dots, \alpha_n \in \bar{k}$ are the roots of $F(x)$ which is
irreducible over $k = \mathbb{Q}(t)$. If $a_n(t_0) \ne 0$ there correspond to
them the roots $\alpha'_1, \dots, \alpha'_n \in \bar{\mathbb{Q}}$ of
$f(t_0, x)$. Suppose that $f(t_0, x)$ is reducible over $\mathbb{Q}$. After a
renumeration of the roots we may assume that
$f(t_0, x) = a_n(t_0) g_0(x) h_0(x)$, where

$$g_0(x) = (x - \alpha'_1) \cdots (x - \alpha'_s) \in \mathbb{Q}[x],$$
$$h_0(x) = (x - \alpha'_{s+1}) \cdots (x - \alpha'_n) \in \mathbb{Q}[x].$$

Set

$$g(x) = (x - \alpha_1) \cdots (x - \alpha_s) \quad\text{and}\quad
h(x) = (x - \alpha_{s+1}) \cdots (x - \alpha_n).$$

Then $F(x) = a_n(t) g(x) h(x)$, where $g(x), h(x) \in \bar{k}[x]$. By
assumption $F(x)$ is irreducible over $k$, and hence $g(x)$ has a coefficient
$y$ which belongs to $\bar{k} \setminus k$. Recall that $k = \mathbb{Q}(t)$, so
that $y$ is algebraic over $\mathbb{Q}(t)$, i.e.,

$$b_m(t) y^m + b_{m-1}(t) y^{m-1} + \dots + b_0(t) = 0, \quad\text{where }
b_i(t) \in \mathbb{Q}(t).$$

As a result we obtain an algebraic curve $C$ (with rational coefficients) in the
plane $(t, y)$. To the coefficient $y$ of $g$ there corresponds a coefficient
$y_0 \in \mathbb{Q}$ of the polynomial $g_0$ and this coefficient satisfies the
relation

$$b_m(t_0) y_0^m + b_{m-1}(t_0) y_0^{m-1} + \dots + b_0(t_0) = 0,$$

i.e., $C$ has a rational point $(t_0, y_0)$.

Thus, consider all polynomials of the form
$(x - \alpha_{i_1}) \cdots (x - \alpha_{i_k})$, where $1 \le k \le n - 1$, and,
for each of these polynomials, select the coefficient that does not belong to
$\mathbb{Q}(t)$. To these coefficients we assign plane algebraic curves
$C_1, \dots, C_N$ with rational coefficients. If $t_0 \in \mathbb{Q}$ is such
that none of the curves $C_1, \dots, C_N$ has rational points $(t_0, y_0)$, the
polynomial $f(t_0, x)$ is irreducible.

Let us now study rational points of the plane algebraic curve C given by 
the equation 


bin(t)y™ + bn_1(t)y™1 + «+--+ bo(t)=0, where 6,(t) € Z(t). 


First we make the change of variable y = b,,(t)y. As a result we obtain the 
curve 


GF” + Pm—1(t)G" 1? + bn—2(t)bm(£)¥" 2 + +++ + bo(t)(Bm(t))” = 0. 


If (to, Yo) is a rational point on this curve and to € Z, then yo € Z. In what 
follows we confine ourselves to the study of integer points on the curve. 
Thus, we may assume that b,,(t) = 1, ie., the curve is given by the equa- 
tion 
y™ + bm—1(t)y”* +++++ bo(t) =0, where 6,(t) € Z(t). 


Let us show that in a vicinity of the point $t = \infty$ the algebraic function $y(t)$
can be expanded as

$$y(t) = a(\sqrt[k]{t})^n + \dots + b + c(\sqrt[k]{t})^{-1} + \dots,$$

where $\sqrt[k]{t}$ is one of the branches of the $k$-th root of $t$ (to be definite, we select
the branch for which $\sqrt[k]{t} > 0$ for $t > 0$). The map $(y, t) \mapsto t$ determines
a ramified covering $M^2 \to \mathbb{C}P^1$, where $M^2$ is the Riemann surface of the
algebraic function $y(t)$. We are interested in the branches of this covering over
$\infty$. Take one of the branches and consider its intersection with the preimage
of a neighborhood of $\infty$. The restriction of the ramified covering onto this set
is of the form $z \mapsto z^k$. This means that $y(z)$ is single-valued and $z^k = t$. It is
also clear that $z = \infty$ is not an essentially singular point of $y(z)$.

We are interested in the case where there exists an infinite increasing
sequence of positive integers $t_i$ for which $y(t_i)$ is a real (and moreover integer)
number. Let us show that in this case all the coefficients of the expansion of $y(t)$
are real. Suppose that these coefficients are not all real. Let $\xi t^{s/k}$ be the term
of highest degree $s/k$ with non-real $\xi$. Then, for real values of $t$, the terms of
higher degree do not affect the imaginary part of the sum of the series, and
as $t \to +\infty$ the terms of lesser degree are small as compared with $\xi t^{s/k}$ and
cannot cancel its imaginary part.

It remains to perform the last step: proving that the numbers $t_i \in \mathbb{N}$ for
which $y(t_i) \in \mathbb{Z}$ constitute a set of zero density. This easily follows from the
next statement.


Theorem 2.4.2. Let

$$\varphi(t) = a(\sqrt[k]{t})^n + \dots + b + c(\sqrt[k]{t})^{-1} + \dots,$$

where $t$ is real and the series is real and converges for $t > R$. Suppose that
$\varphi(t)$ is not a polynomial. Then there exist constants $C > 0$ and $\varepsilon \in (0, 1)$ such
that the number of positive integers $t \le N$ for which $\varphi(t) \in \mathbb{Z}$ does not exceed
$CN^{\varepsilon}$.


Proof. Observe first of all that in the expansion of the $m$-th derivative of
$\varphi(t)$ there are no terms of the form $t^{\nu}$, where $\nu > \frac{n}{k} - m$. Therefore we can
select an integer $m \ge 1$ such that $\varphi^{(m)}(t) \sim ct^{-\mu}$ as $t \to \infty$, where $\mu > 0$ and
$c \ne 0$ (the latter property is ensured by the fact that $\varphi$ is not a polynomial).


Lemma. There exist positive constants $c_1$ and $\alpha$ such that, if $T$ is suffi-
ciently large, then the interval $[T, T + c_1T^{\alpha}]$ contains no more than $m$ positive
integers $t$ for which $\varphi(t) \in \mathbb{Z}$.


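Proof. Let $t_1 < \dots < t_{m+1}$ be positive integers, $t_1 \ge T$, for which
$\varphi(t_i) \in \mathbb{Z}$. Consider Lagrange's interpolation polynomial

$$f(t) = \sum_{i=1}^{m+1} \varphi(t_i)\,\frac{(t - t_1)\cdots(t - t_{i-1})(t - t_{i+1})\cdots(t - t_{m+1})}{(t_i - t_1)\cdots(t_i - t_{i-1})(t_i - t_{i+1})\cdots(t_i - t_{m+1})}.$$

The function $\varphi - f$ vanishes at $t_1, \dots, t_{m+1}$, and hence by Rolle's theorem
there exists a point $\xi \in [t_1, t_{m+1}]$ such that

$$\varphi^{(m)}(\xi) = f^{(m)}(\xi) = \sum_{i=1}^{m+1} \frac{m!\,\varphi(t_i)}{(t_i - t_1)\cdots(t_i - t_{i-1})(t_i - t_{i+1})\cdots(t_i - t_{m+1})}.$$

Hence, $\varphi^{(m)}(\xi)$ is a rational number whose denominator does not exceed

$$\prod_{1 \le i < j \le m+1} (t_j - t_i) \le (t_{m+1} - t_1)^{m(m+1)/2}.$$

On the other hand, if $t_1$ is sufficiently large, then $0 < |\varphi^{(m)}(\xi)| < c_2t_1^{-\mu}$.
Set $\Delta T = t_{m+1} - t_1$. Then $|f^{(m)}(\xi)| \ge \Delta T^{-m(m+1)/2}$, and so
$c_2t_1^{-\mu} > \Delta T^{-m(m+1)/2}$, i.e., $\Delta T > c_1T^{\alpha}$, where

$$\alpha = \frac{2\mu}{m(m+1)} \quad \text{and} \quad c_1 = c_2^{-2/(m(m+1))}.$$

Thus $m + 1$ such integers cannot lie in an interval of length $c_1T^{\alpha}$.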


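Returning to the proof of the theorem, we select $\varepsilon$ so that $1 - \alpha\varepsilon = \varepsilon$, i.e.,
$\varepsilon = \frac{1}{1 + \alpha}$. Then $0 < \varepsilon < 1$. Let us split the interval $[1, N]$ into subintervals
$[1, N^{\varepsilon}]$ and $[N^{\varepsilon}, N]$. By the Lemma any segment of length $c_1(N^{\varepsilon})^{\alpha}$ that lies
in $[N^{\varepsilon}, N]$ contains no more than $m$ positive integers $t$ for which $\varphi(t) \in \mathbb{Z}$,
and hence the total number of such positive integers in $[1, N]$ does not exceed

$$N^{\varepsilon} + m\,\frac{N - N^{\varepsilon}}{c_1N^{\alpha\varepsilon}} < N^{\varepsilon} + \frac{m}{c_1}N^{1 - \alpha\varepsilon} = N^{\varepsilon} + \frac{m}{c_1}N^{\varepsilon}.$$

Therefore we can set $C = 1 + \frac{m}{c_1}$.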


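Let $B(N)$ be the total number of positive integers $t \le N$ for which $\varphi(t) \in \mathbb{Z}$.
Then

$$\lim_{N \to \infty} \frac{B(N)}{N} \le \lim_{N \to \infty} \frac{CN^{\varepsilon}}{N} = 0.$$

This means in particular that there exist infinitely many positive integers $t$
for which $\varphi(t) \notin \mathbb{Z}$.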


2.5 Algorithms for factorization into irreducible factors 


2.5.1 Berlekamp’s algorithm 


The most effective algorithms for factorizing a polynomial with integer coef-
ficients into irreducible factors use the factorization of this polynomial over
the fields $\mathbb{F}_p$ for a prime $p$. Therefore we first discuss one of the algorithms for
factorizing polynomials modulo $p$, suggested by Berlekamp [Be4].

Let $f$ be a polynomial with coefficients from $\mathbb{F}_p$. We may assume that
the polynomial is monic. Before we apply Berlekamp's algorithm we have
to get rid of multiple irreducible factors of $f$. This is done as follows. Let
$f = f_1^{n_1} \cdots f_k^{n_k}$, where $f_1, \dots, f_k$ are distinct irreducible monic polynomials.
It is easy to verify that

$$d = (f, f') = \prod_{p \nmid n_i} f_i^{n_i - 1} \cdot \prod_{p \mid n_i} f_i^{n_i}, \qquad \frac{f}{d} = \prod_{p \nmid n_i} f_i.$$

The polynomial $f$ factorizes into the product of $d$ and $f/d$, and in the fac-
torization of $f/d$ there are no multiple irreducible factors. If $\deg d < \deg f$,
then we can apply the same procedure to $d$. If $0 < \deg d = \deg f$, then
$d = \prod_{p \mid n_i} f_i^{n_i} = g^p$. Clearly, $\deg g < \deg f$ and from a factorization of $g$ one can
recover the factorization of $d$.
Berlekamp's algorithm is based on the following theorem.

Theorem 2.5.1. Let $f \in \mathbb{F}_p[x]$ be a monic polynomial of positive degree $n$.
a) If $h \in \mathbb{F}_p[x]$ satisfies the relation $h^p \equiv h \pmod f$, i.e., $h^p - h$ is divisible
by $f$, then

$$f(x) = \prod_{a \in \mathbb{F}_p} (f(x), h(x) - a).$$

b) Let $f = f_1 \cdots f_k$, where $f_1, \dots, f_k$ are different irreducible monic
polynomials. In this case $h$ satisfies the relation $h^p \equiv h \pmod f$ if and only
if $h(x) \equiv a_i \pmod{f_i}$, where $a_i \in \mathbb{F}_p$. To each collection $(a_1, \dots, a_k)$ there
corresponds exactly one polynomial $h$ whose degree is less than that of $f$.


Proof. a) Set $F(x) = \prod_{a \in \mathbb{F}_p} (f(x), h(x) - a)$. Polynomials $h(x) - a$ with
distinct $a$ are relatively prime, and hence polynomials $(f(x), h(x) - a)$ are
relatively prime divisors of $f(x)$. Hence $f(x)$ is divisible by their product
$F(x)$. On the other hand, in $\mathbb{F}_p$ the polynomial identity $\prod_{a \in \mathbb{F}_p} (y - a) = y^p - y$
holds, and hence the polynomial

$$\prod_{a \in \mathbb{F}_p} (h(x) - a) = (h(x))^p - h(x)$$

is divisible by $f(x)$, and therefore $F(x)$ is divisible by $f(x)$. Thus, the poly-
nomials $f$ and $F$ are divisible by each other and are monic. Hence $F = f$.

b) If $h(x) \equiv a_i \pmod{f_i}$, then $(h(x))^p \equiv a_i^p = a_i \equiv h(x) \pmod{f_i}$, and
therefore $(h(x))^p \equiv h(x) \pmod{f_1 \cdots f_k}$. Conversely, if the polynomial

$$(h(x))^p - h(x) = \prod_{a \in \mathbb{F}_p} (h(x) - a)$$

is divisible by $f$, then it is divisible by all the polynomials $f_1, \dots, f_k$. It is
also clear that if the irreducible polynomial $f_i$ divides the product of pairwise
relatively prime factors $h(x) - a$, then it divides one of these factors, i.e.,
$h(x) \equiv a_i \pmod{f_i}$.

The existence and uniqueness of a polynomial $h$ corresponding to a given
collection $(a_1, \dots, a_k)$ obviously follows from the next statement called the
Chinese remainder theorem for polynomials.


Lemma. Let $f_1, \dots, f_k$ be relatively prime irreducible polynomials over a
field $F$, and let $g_1, \dots, g_k$ be arbitrary polynomials over the same field. Then
there exists a polynomial $h$ such that $h(x) \equiv g_i \pmod{f_i}$ and this polynomial
is uniquely determined modulo $f = f_1 \cdots f_k$.


Proof. The polynomials $f_i$ and $F_i = f/f_i$ are relatively prime, and hence
there exist polynomials $a_i$ and $b_i$ such that $a_if_i + b_iF_i = 1$. Further, $b_iF_i \equiv 1
\pmod{f_i}$ and $b_iF_i \equiv 0 \pmod{f_j}$ for $j \ne i$.

Set $h = \sum g_ib_iF_i$. Then $h \equiv g_ib_iF_i \equiv g_i \pmod{f_i}$. The existence
of the required polynomial $h$ is thereby proved.

The uniqueness of $h$ follows from the fact that, if $h_1 - g_i$ and $h_2 - g_i$ are
divisible by $f_i$, then $h_1 - h_2$ is divisible by $f_1 \cdots f_k = f$.


The relation

$$(h(x))^p \equiv h(x) \pmod f$$

is equivalent to a system of linear equations over $\mathbb{F}_p$. Indeed, recall that
$\deg h < \deg f = n$ and let

$$h(x) = t_0 + t_1x + \dots + t_{n-1}x^{n-1}.$$

Then

$$h(x)^p = h(x^p) = t_0 + t_1x^p + \dots + t_{n-1}x^{p(n-1)}.$$

We find the residue of each monomial $x^{pj}$, where $j = 0, 1, \dots, n - 1$, after
division by $f$:

$$x^{pj} \equiv \sum_{i=0}^{n-1} q_{ij}x^i \pmod f.$$

As a result we obtain a system of linear equations:

$$\sum_{j=0}^{n-1} q_{ij}t_j = t_i, \qquad i = 0, 1, \dots, n - 1.$$

The dimension of the space of solutions of this system is equal to $k$, the number
of irreducible factors of $f$. Clearly, $q_{00} = 1$ and $q_{i0} = 0$ for $i > 0$. Therefore
the system has a trivial solution $t_0 = c$, $t_1 = \dots = t_{n-1} = 0$. This solution
corresponds to the polynomial $h$ of degree zero.
Let $h_1 = 1, h_2, \dots, h_k$ be a basis in the space of solutions. If $k = 1$, then $f$
is irreducible. If $k > 1$, let us find the greatest common divisors of polynomials
$f(x)$ and $h_2(x) - a$ for all $a \in \mathbb{F}_p$. As a result we obtain a collection of divisors
$g_1, \dots, g_s$ of $f$. If $s < k$, then, for each $g_i$, we compute $(g_i, h_3(x) - a)$, and
so on, until we obtain all $k$ divisors.

It is easy to verify that at the end we necessarily obtain all the $k$ divisors.
Indeed, let $f_1$ and $f_2$ be distinct irreducible divisors of $f$. Consider a collection
$(a_1, a_2, \dots, a_k)$, where $a_1 \ne a_2$. Then there is a corresponding polynomial $h$
for which

$$h(x) \equiv a_1 \pmod{f_1} \quad \text{and} \quad h(x) \equiv a_2 \pmod{f_2}.$$

Therefore, for a certain basis polynomial $h_j$, we should have $h_j(x) \equiv a_{1j}
\pmod{f_1}$ and $h_j(x) \equiv a_{2j} \pmod{f_2}$, where $a_{1j} \ne a_{2j}$. Such a polynomial
distinguishes factors $f_1$ and $f_2$.
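
The procedure just described can be sketched in code. The following is a minimal illustration, not an optimized implementation: polynomials over $\mathbb{F}_p$ are represented as coefficient lists with the lowest degree first, the input is assumed to be monic and free of multiple factors, and all helper names (`trim`, `poly_rem`, `poly_gcd`, `nullspace`, `berlekamp`) are hypothetical choices made for this sketch. It needs Python 3.8 or later for `pow(x, -1, p)`.

```python
def trim(a):
    """Drop zero leading coefficients (stored at the end of the list)."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_rem(a, b, p):
    """Remainder of a modulo a nonzero polynomial b over F_p."""
    a = trim([c % p for c in a])
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        q = a[-1] * inv % p
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - q * c) % p
        trim(a)
    return a

def poly_gcd(a, b, p):
    """Monic greatest common divisor over F_p (a is assumed nonzero)."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_rem(a, b, p)
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

def nullspace(M, p):
    """Basis of the solutions of M t = 0 over F_p (Gauss-Jordan elimination)."""
    n = len(M)
    M = [[v % p for v in row] for row in M]
    pivots, r = {}, 0                      # pivots: column -> row
    for c in range(n):
        pr = next((i for i in range(r, n) if M[i][c]), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        inv = pow(M[r][c], -1, p)
        M[r] = [v * inv % p for v in M[r]]
        for i in range(n):
            if i != r and M[i][c]:
                k = M[i][c]
                M[i] = [(v - k * w) % p for v, w in zip(M[i], M[r])]
        pivots[c] = r
        r += 1
    basis = []
    for free in range(n):
        if free not in pivots:
            t = [0] * n
            t[free] = 1
            for c, row in pivots.items():
                t[c] = -M[row][free] % p
            basis.append(t)
    return basis

def berlekamp(f, p):
    """Irreducible monic factors of a monic squarefree f over F_p."""
    n = len(f) - 1
    cols = []                              # cols[j] = x^(p*j) mod f
    for j in range(n):
        r = poly_rem([0] * (p * j) + [1], f, p)
        cols.append(r + [0] * (n - len(r)))
    M = [[(cols[j][i] - (i == j)) % p for j in range(n)] for i in range(n)]
    basis = nullspace(M, p)                # its size is the number k of factors
    factors = [f]
    for h in basis:
        if len(factors) == len(basis):
            break
        split = []
        for g in factors:
            for a in range(p):
                d = poly_gcd(g, [(h[0] - a) % p] + h[1:], p)
                if len(d) > 1:
                    split.append(d)
        factors = split
    return sorted(factors)

# x^4 + 1 = (x^2 + x + 2)(x^2 + 2x + 2) over F_3
print(berlekamp([1, 0, 0, 0, 1], 3))
```

Computing $x^{pj} \bmod f$ by reducing the monomial directly is only reasonable for small $p$ and $n$; it is used here to keep the sketch close to the text.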


Remark. The efficiency of Berlekamp's algorithm can be essentially en-
hanced by using the factorization algorithm due to Cantor and Zassen-
haus [Ca2]. Namely, instead of the polynomials $h_i(x) - a = h_i(x) - ah_1(x)$,
we take the polynomials

$$H(x) = a_1h_1(x) + \dots + a_kh_k(x),$$

where $a_1, \dots, a_k$ is a random collection of elements from $\mathbb{F}_p$, and then
compute the greatest common divisor of $f$ and $H(x)^{(p-1)/2} - 1$. If $f$ is reducible
and $p \ge 3$, then with probability $\ge \frac{4}{9}$ we immediately obtain a nontrivial
factorization.


2.5.2 Factorization with the help of Hensel’s lemma 


On page 49 we gave Kronecker's algorithm for factorizing a polynomial with
integer coefficients into irreducible factors, but this algorithm requires too
much computation. Much more effective algorithms are known. The triple-L
algorithm due to Lenstra, Lenstra and Lovász, which we discuss in the Appendix
(see p. 279), is of the greatest theoretical interest. In practice, however, it often
turns out to be slower than the factorization algorithm we describe now.

Let $f$ be a polynomial with integer coefficients. If $\mathrm{cont}(f) \ne 1$, then having
divided $f$ by $\mathrm{cont}(f)$ we get a polynomial with content 1. Let $f = f_1^{n_1} \cdots f_k^{n_k}$
be a factorization of $f$ over $\mathbb{Z}$ into irreducible factors. Then

$$f' = f_1^{n_1 - 1} \cdots f_k^{n_k - 1}g, \quad \text{where } g \in \mathbb{Z}[x].$$

Therefore, over $\mathbb{Z}$, the greatest common divisor of $f$ and $f'$ is equal to
$f_1^{n_1 - 1} \cdots f_k^{n_k - 1}$, the same as over $\mathbb{Q}$. Therefore, over $\mathbb{Z}$, we can also divide $f$ by
$(f, f')$ and obtain a polynomial without multiple roots. So in what follows we
will assume that $\mathrm{cont}(f) = 1$ and polynomials $f$ and $f'$ are relatively prime.


Since $(f, f') = 1$, there exist polynomials $u, v \in \mathbb{Q}[x]$ for which

$$uf + vf' = 1.$$

Hence there exist polynomials $\tilde u, \tilde v \in \mathbb{Z}[x]$ for which $\tilde uf + \tilde vf' = n$, where
$n \in \mathbb{N}$. If a prime $p$ is relatively prime to $n$, then, over $\mathbb{F}_p$, the greatest common
divisor of $f$ and $f'$ is equal to 1. Let us consecutively calculate $(f, f')$ over
$\mathbb{F}_p$ for $p = 2, 3, 5, \dots$ until we get $(f, f') = 1$ and simultaneously the highest
coefficient of $f$ is relatively prime to $p$. Fix this $p$ in what follows.

Modulo $p$, the polynomial $f$ has no multiple irreducible factors. Hence we
can apply Berlekamp's algorithm and obtain a factorization $f \equiv af_1 \cdots f_k$
modulo $p$, where $f_1, \dots, f_k \in \mathbb{Z}[x]$ are monic polynomials, $a$ is the highest
coefficient of $f$ and $\deg f = \deg f_1 + \dots + \deg f_k$. Hensel's lemma given below
enables us to construct a factorization of $f$ modulo $p^m$ starting from the above
factorization.

It suffices to consider the following situation:

  $f \equiv f_1f_2 \pmod{p^m}$, where $f, f_1, f_2 \in \mathbb{Z}[x]$, $\deg f = \deg f_1 + \deg f_2$,
  $f_1$ is monic, the leading coefficient of $f$ is relatively prime to $p$, and      (*)
  the polynomials $f_1$ and $f_2$ are relatively prime modulo $p$.

The last line implies that there exist polynomials $u, v \in \mathbb{Z}[x]$ for which
$uf_1 + vf_2 \equiv 1 \pmod p$. If $u', v'$ are some other such polynomials, then $u' \equiv u + wf_2
\pmod p$ and $v' \equiv v - wf_1 \pmod p$, where $w \in \mathbb{Z}[x]$. Therefore the conditions
$\deg u < \deg f_2$, $\deg v < \deg f_1$ uniquely determine $u$ and $v$ modulo $p$.

Hensel's continuation of the factorization $f \equiv f_1f_2 \pmod{p^m}$ is a fac-
torization $f \equiv \tilde f_1\tilde f_2 \pmod{p^{m+1}}$, where the polynomials $\tilde f_1$ and $\tilde f_2$ sat-
isfy the same conditions as $f_1$ and $f_2$, and where $\tilde f_i \equiv f_i \pmod{p^m}$ and
$\deg \tilde f_i = \deg f_i$.


Lemma. If $m \ge 1$, then, for any factorization $f \equiv f_1f_2 \pmod{p^m}$ sat-
isfying the above condition (*), there exists Hensel's extension $f \equiv \tilde f_1\tilde f_2
\pmod{p^{m+1}}$ for which the polynomials $\tilde f_1$ and $\tilde f_2$ are uniquely determined
modulo $p^{m+1}$.


Proof. We are looking for polynomials $\tilde f_1, \tilde f_2 \in \mathbb{Z}[x]$ such that

$$\tilde f_i = f_i + p^mg_i, \qquad \deg \tilde f_i = \deg f_i \quad \text{for } i = 1, 2,$$

$\tilde f_1$ is monic and the congruence $f \equiv \tilde f_1\tilde f_2 \pmod{p^{m+1}}$ holds, i.e.,

$$f_1f_2 + p^m(g_2f_1 + g_1f_2) + p^{2m}g_1g_2 \equiv f \pmod{p^{m+1}}.$$

Clearly, $p^{2m}g_1g_2 \equiv 0 \pmod{p^{m+1}}$, and so we come to the congruence

$$g_2f_1 + g_1f_2 \equiv d \pmod p, \qquad (1)$$

where $d = p^{-m}(f - f_1f_2) \in \mathbb{Z}[x]$. The solutions of (1) can be obtained in
terms of polynomials $u$ and $v$ for which $uf_1 + vf_2 \equiv 1 \pmod p$. Namely,

$$g_1 \equiv dv + wf_1 \pmod p \quad \text{and} \quad g_2 \equiv du - wf_2 \pmod p,$$

where $w \in \mathbb{Z}[x]$ is an arbitrary polynomial. Since $\tilde f_1 = f_1 + p^mg_1$, where $f_1$
and $\tilde f_1$ are monic polynomials, it follows that $\deg g_1 < \deg f_1$. Therefore, for
a given $v$, the polynomial $g_1$ is uniquely determined modulo $p$. In this case
the polynomial $g_2$ is also uniquely determined modulo $p$. Hence $\tilde f_1$ and $\tilde f_2$ are
uniquely determined modulo $p^{m+1}$.
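
One step of this lifting can be sketched as follows. This is an illustrative sketch under stated assumptions, not a production routine: polynomials are integer coefficient lists with the lowest degree first, `f1` is monic of positive degree, and the cofactor `v` with $uf_1 + vf_2 \equiv 1 \pmod p$ is supplied by the caller (it could be found by the Euclidean algorithm over $\mathbb{F}_p$). The function names are hypothetical. The code takes $g_1$ to be the remainder of $dv$ modulo $f_1$ and then obtains $g_2$ from congruence (1) by exact division.

```python
def mul(a, b):
    """Product of two coefficient lists over Z."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def sub(a, b):
    """Difference a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

def divmod_monic(a, b, p):
    """Quotient and remainder of a by the monic polynomial b over F_p."""
    a = [c % p for c in a]
    q = [0] * max(len(a) - len(b) + 1, 1)
    for k in range(len(a) - len(b), -1, -1):
        q[k] = a[k + len(b) - 1]
        for i, c in enumerate(b):
            a[k + i] = (a[k + i] - q[k] * c) % p
    return q, a[:len(b) - 1]

def hensel_step(f, f1, f2, v, p, m):
    """Lift f = f1*f2 (mod p^m) to a factorization modulo p^(m+1)."""
    pm = p ** m
    d = [c // pm for c in sub(f, mul(f1, f2))]        # d = (f - f1 f2) / p^m
    _, g1 = divmod_monic(mul(d, v), f1, p)            # g1 = d v mod f1, deg g1 < deg f1
    g2, rest = divmod_monic(sub(d, mul(g1, f2)), f1, p)
    assert not any(rest), "v must satisfy u*f1 + v*f2 = 1 (mod p)"
    mod = pm * p
    new_f1 = [(c + pm * g) % mod for c, g in zip(f1, g1 + [0])]
    new_f2 = [(c + pm * g) % mod for c, g in zip(f2, g2)]
    return new_f1, new_f2

# x^2 - 2 = (x + 4)(x + 3) (mod 7); here 1*(x + 4) + 6*(x + 3) = 1 (mod 7), so v = 6
f = [-2, 0, 1]
lifted = hensel_step(f, [4, 1], [3, 1], [6], 7, 1)
print(lifted)   # factors modulo 49
```

In the example the lifted factors are $x + 39$ and $x + 10$, i.e. $x \mp 10$ modulo 49, and indeed $10^2 = 100 \equiv 2 \pmod{49}$.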


Remark. The process of deriving the polynomial factorization modulo
$p^m$, where $m$ is large, can be essentially speeded up if one raises factorizations
modulo $q$ to factorizations modulo $qr$, where $r = (p, q)$. For details, see [Co3].


To obtain a factorization of $f \in \mathbb{Z}[x]$ into irreducible factors, one can use
the following procedure (we assume that $\mathrm{cont}(f) = 1$ and $f$ has no multiple
roots). By means of Mignotte's inequality (Theorem 4.2.6 on page 152) we
get an estimate $M$ for the coefficients of the divisors of $f$ whose degree does
not exceed $\frac{1}{2}\deg f$. Next, we select $m$ such that $p^m > 2aM$, where $a > 0$ is
the leading coefficient of $f$. Then, using Berlekamp's algorithm and Hensel's
lemma, we factorize: $f \equiv a \cdot f_1 \cdots f_k \pmod{p^m}$, where $f_1, \dots, f_k \in \mathbb{Z}[x]$ are
monic polynomials.

Let $g(x) = a_1x^l + \dots \in \mathbb{Z}[x]$ be a divisor of $f$. Then $a_2 = a/a_1 \in \mathbb{N}$, and,
modulo $p^m$, the polynomial $a_2g$ is of the form $a \cdot f_{i_1} \cdots f_{i_l}$. The condition
$p^m > 2aM$ shows that the polynomial $a_2g$ is uniquely recovered from the
polynomial $a \cdot f_{i_1} \cdots f_{i_l} \pmod{p^m}$. Indeed, the coefficients of $a_2g$ lie strictly
between $-\frac{p^m}{2}$ and $\frac{p^m}{2}$, and hence they are uniquely recovered from their
residues after division by $p^m$.
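
The recovery of a coefficient from its residue amounts to choosing the representative of smallest absolute value. A minimal sketch (the function name `symmetric_residue` is a hypothetical choice for this illustration):

```python
def symmetric_residue(c, modulus):
    """Representative of c modulo `modulus` lying in (-modulus/2, modulus/2]."""
    r = c % modulus
    return r - modulus if 2 * r > modulus else r

# Coefficients of (x + 39)(x + 10) reduced modulo 49 are [47, 0, 1];
# the symmetric representatives give back x^2 - 2.
print([symmetric_residue(c, 49) for c in [47, 0, 1]])
```

Whether a candidate obtained this way actually divides $f$ over $\mathbb{Z}$ still has to be checked by trial division.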


2.6 Problems to Chapter 2 


2.1 Let $f \in \mathbb{Z}[x]$ be a polynomial with roots $\alpha_1, \dots, \alpha_n$ and let $M = \max_i |\alpha_i|$.
Prove that, if $f(x_0)$ is a prime for an integer $x_0$ such that $|x_0| > M + 1$, then
$f$ is irreducible.


2.2 Let $p$ be a prime, and $a$ a positive integer not divisible by $p$. Prove that
$x^p - x - a$ is irreducible.


2.3 For a polynomial $f \in \mathbb{Z}[x]$, let there be an integer $n$ such that:

1) All the roots of $f$ lie in the half-plane $\mathrm{Re}\, z < n - \frac{1}{2}$.
2) $f(n - 1) \ne 0$.
3) $f(n)$ is a prime.

Prove that $f$ is irreducible.


2.4 [Kl] a) Let $f(x) = f_nx^n + \dots + f_0 \in \mathbb{Z}[x]$, where $|f_0| > 1$. Further,
let $\{c_1, \dots, c_r\}$ be the set of all divisors of $|f_0|$. Let $f$ assume prime values
$p_1, \dots, p_n$ at $n$ distinct integer points $a_1, \dots, a_n$ such that $|a_i| > 2$ and $a_i$
does not divide $c_j \pm 1$, where $i = 1, \dots, n$ and $j = 1, \dots, r$. Prove that $f$ is
irreducible.

b) Let $a_1, \dots, a_n, r$ and $s$ be integers such that $|a_k| > 2$. Let the numbers
$q = (-1)^na_1 \cdots a_n + s$ and $p_k = ra_k + s$, where $k = 1, \dots, n$, be prime with
$q \pm 1$ not divisible by $a_k$. Prove that the polynomial

$$f(x) = (x - a_1) \cdots (x - a_n) + rx + s$$

is irreducible.

2.5 Let $f(x) = x^n + a_1x^{n-1} + \dots + a_{n-1}x + pa_n$ be a polynomial with
integer coefficients and $p$ a prime. Prove that if $p > \sum_{i=0}^{n-1} |a_n|^{n-i-1}|a_i|$ (where $a_0 = 1$), then $f$
is irreducible.

2.6 Let $a_1, \dots, a_n$ be distinct integers.
a) Prove that the polynomial $(x - a_1)(x - a_2) \cdots (x - a_n) - 1$ is irreducible.
b) Prove that the polynomial $(x - a_1)(x - a_2) \cdots (x - a_n) + 1$ is irreducible
except for the following cases:

$$(x - a)(x - a - 2) + 1 = (x - a - 1)^2;$$
$$(x - a)(x - a - 1)(x - a - 2)(x - a - 3) + 1 = \bigl((x - a - 1)(x - a - 2) - 1\bigr)^2.$$

c) Prove that the polynomial $(x - a_1)^2(x - a_2)^2 \cdots (x - a_n)^2 + 1$ is
irreducible.


2.7 Prove that any polynomial with integer coefficients can be represented 
as the sum of two irreducible polynomials. 


2.8 a) Let $f(x)$ be a polynomial with integer coefficients assuming the value
$+1$ at more than three integer points. Prove that $f(n) \ne -1$ for any $n \in \mathbb{Z}$.
b) Let $a, b \in \mathbb{Z}$ and let the polynomial $ax^2 + bx + 1$ be irreducible. Let $n \ge 7$
and let $a_1, \dots, a_n$ be distinct integers; set $\varphi(x) = (x - a_1)(x - a_2) \cdots (x - a_n)$.
Prove that the polynomial $a(\varphi(x))^2 + b\varphi(x) + 1$ is irreducible.


2.9 Let $F(x_1, \dots, x_n) \in \mathbb{Z}[x_1, \dots, x_n]$; set $f(x) = F(x, \dots, x)$. Prove that if
$f$ is irreducible, then $F$ is also irreducible.


2.10 Let $p > 3$ be a prime and let $n < 2p$. Prove that the polynomial
$x^{2p} + px^n - 1$ is irreducible.


2.11 Let $p > 3$ be a prime, $a_1 + \dots + a_p = 2p$ and $n < 2p$. Prove that the
polynomial

$$x_1^{a_1} \cdots x_p^{a_p} + x_1^n + \dots + x_p^n - 1$$

is irreducible.


2.12 Let $f$ be an irreducible polynomial of degree $n$ with integer coefficients, let $D$ be its
discriminant and $p$ a prime. Suppose that modulo $p$ the polynomial $f$ factors
into $k$ irreducible factors. Prove that $D^{(p-1)/2} \equiv (-1)^{n-k} \pmod p$.


2.7 Solutions of selected problems 


2.1. Let $f(x) = g(x)h(x)$, where $g, h \in \mathbb{Z}[x]$ and $\deg g \ge 1$, $\deg h \ge 1$. Since
$f(x_0) = p$ is a prime, we may assume that $g(x_0) = \pm 1$ and $h(x_0) = \pm p$. On
the other hand, the roots $\beta_1, \dots, \beta_k$ of $g$ are also roots of $f$, so that $|\beta_i| \le M$,
and therefore

$$|g(x_0)| = |a_0| \prod |x_0 - \beta_i|,$$

where $|a_0| \ge 1$ and $|x_0 - \beta_i| \ge |x_0| - |\beta_i| > (M + 1) - M = 1$. Hence $|g(x_0)| > 1$:
a contradiction.




2.2. Let $x^p - x - a$ be reducible over $\mathbb{Z}$. Then it is reducible as a polynomial
over $\mathbb{Z}_p$. Thus, over $\mathbb{Z}_p$, we have

$$x^p - x - a = g(x)h(x), \quad \text{where } 1 \le \deg g \le p - 1,$$

and the polynomial $g$ is irreducible. If $b \in \mathbb{Z}_p$, then

$$g(x - b)h(x - b) = (x - b)^p - (x - b) - a = x^p - x - a.$$

Thus, the polynomial $x^p - x - a$ is divisible by $p$ polynomials $g_i(x) = g(x - i)$,
where $i = 0, 1, \dots, p - 1$. As $\deg g \le p - 1$, these polynomials are distinct
because

$$(x - i)^k - (x - j)^k = (j - i)kx^{k-1} + \dots.$$

Therefore $p = \deg(x^p - x - a) \ge p \deg g$. It follows that $\deg g = 1$. But if $a$
is not divisible by $p$, the polynomial $x^p - x - a$ has no roots in $\mathbb{Z}_p$ because
$b^p - b = 0$ for any $b \in \mathbb{Z}_p$.

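2.3. Let $f(x) = g(x)h(x)$, where $g, h \in \mathbb{Z}[x]$ and $\deg g \ge 1$, $\deg h \ge 1$.
Since $f(n) = p$ is prime, we may assume that $g(n) = \pm 1$ and $h(n) = \pm p$.
On the other hand, if $g(\beta_i) = 0$, then $f(\beta_i) = 0$, and so $\mathrm{Re}\, \beta_i < n - \frac{1}{2}$, i.e.,
$\mathrm{Re}\,(n - \frac{1}{2} - \beta_i) > 0$. This means that

$$\left| n - \tfrac{1}{2} - \beta_i - t \right| < \left| n - \tfrac{1}{2} - \beta_i + t \right| \quad \text{for } t > 0,$$

and so $|g(n - \frac{1}{2} - t)| < |g(n - \frac{1}{2} + t)|$. The condition $f(n - 1) \ne 0$ implies
that $|g(n - 1)| \ge 1$. Therefore

$$|g(n)| = \left| g\left(n - \tfrac{1}{2} + \tfrac{1}{2}\right) \right| > \left| g\left(n - \tfrac{1}{2} - \tfrac{1}{2}\right) \right| = |g(n - 1)| \ge 1.$$

This is a contradiction.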

2.4. a) Let $f = f_1f_2$, where $f_1, f_2 \in \mathbb{Z}[x]$. Let $f_1(0) = b_1$ and $f_2(0) = c_1$.
For definiteness sake, let us assume that $|b_1| \le |c_1|$.

Case 1: $|b_1| = 1$. In this case $|c_1| = |f_0| > 1$. The condition $f(a_k) = p_k$,
where $p_k$ is a prime, implies that either

$$f_1(a_k) = \pm p_k \quad \text{and} \quad f_2(a_k) = \pm 1 \qquad (1)$$

or

$$f_1(a_k) = \pm 1 \quad \text{and} \quad f_2(a_k) = \pm p_k. \qquad (2)$$

First, suppose that $f_2(a_k) = \pm 1$. Then $f_2(a_k) - f_2(0) = -(c_1 \pm 1)$.
On the other hand, $f_2(a_k) - f_2(0)$ is divisible by $a_k$. This is a contradiction.
It remains to assume that $f_1(a_k) = \pm 1 = \pm b_1 = \pm f_1(0)$. If $f_1(a_k) =
-f_1(0)$, then $f_1(a_k) - f_1(0) = 2f_1(a_k) = \pm 2$. On the other hand, $f_1(a_k) - f_1(0)$
is divisible by $a_k$, and so $|a_k| \le 2$, which contradicts our assumption.
Thus $f_1(a_k) = f_1(0) = b_1$ for $k = 1, \dots, n$. Since $\deg f_1 < n$, we deduce
that $f_1(x) = b_1$ for all $x$.

Case 2: $|b_1| > 1$. In this case $b_1$ and $c_1$ are on equal footing since
$|c_1| \ge |b_1| > 1$. Let, for definiteness sake, $f_1(a_k) = \pm p_k$ and $f_2(a_k) = \pm 1$.
Then $f_2(a_k) - f_2(0) = -(c_1 \pm 1)$ is divisible by $a_k$, which contradicts the
assumption.

b) Obviously follows from a).

2.5. As in the proof of Theorem 2.2.7, we deduce that the product of
absolute values of the roots of one of the polynomials $g$ and $h$ does not exceed
$|a_n|$. To obtain a contradiction, it suffices to show that for any root $\alpha$ of $f$ we
have $|\alpha| > |a_n|$. Suppose that $f(\alpha) = 0$ and $|\alpha| \le |a_n|$. Then

$$|pa_n| = |\alpha^n + a_1\alpha^{n-1} + \dots + a_{n-1}\alpha| \le |a_n| \sum_{i=0}^{n-1} |a_n|^{n-i-1}|a_i| < p|a_n|,$$

which is impossible.


3 


Polynomials of a Particular Form 


3.1 Symmetric polynomials 


3.1.1 Examples of symmetric polynomials 


A polynomial $f(x_1, \dots, x_n)$ is called symmetric if, for any permutation $\sigma \in S_n$,
we have

$$f(x_{\sigma(1)}, \dots, x_{\sigma(n)}) = f(x_1, \dots, x_n).$$

The main examples of symmetric polynomials are the elementary symmet-
ric polynomials

$$\sigma_k(x_1, \dots, x_n) = \sum_{i_1 < \dots < i_k} x_{i_1} \cdots x_{i_k},$$

where $1 \le k \le n$. It is convenient to set $\sigma_0 = 1$ and $\sigma_k(x_1, \dots, x_n) = 0$ for
$k > n$.

One can determine elementary symmetric polynomials with the help of
the generating function

$$\sigma(t) = \sum_{k=0}^{n} \sigma_kt^k = \prod_{i=1}^{n} (1 + tx_i).$$

If $x_1, \dots, x_n$ are the roots of the polynomial $x^n + a_1x^{n-1} + \dots + a_n$,
then

$$\sigma_k(x_1, \dots, x_n) = (-1)^ka_k.$$

Another example of symmetric polynomials is given by the complete ho-
mogeneous symmetric polynomials

$$p_k(x_1, \dots, x_n) = \sum_{i_1 + \dots + i_n = k} x_1^{i_1} \cdots x_n^{i_n}.$$

Their generating function is of the form

$$p(t) = \sum_{k=0}^{\infty} p_kt^k = \prod_{i=1}^{n} \frac{1}{1 - tx_i}.$$

An important example of symmetric polynomials is given by the sums of
powers

$$s_k(x_1, \dots, x_n) = x_1^k + \dots + x_n^k.$$

Their generating function is of the form

$$s(t) = \sum_{k=1}^{\infty} s_kt^{k-1} = \sum_{i=1}^{n} \frac{x_i}{1 - x_it}.$$

Sometimes one uses monomial symmetric polynomials:

$$m_{i_1 \dots i_n}(x_1, \dots, x_n) = \sum_{\sigma \in S_n} x_{\sigma(1)}^{i_1} \cdots x_{\sigma(n)}^{i_n}.$$

The generating functions $\sigma(t)$ and $p(t)$ are related by the equation

$$\sigma(t)p(-t) = 1.$$

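Equating the coefficients of $t^n$, where $n \ge 1$, on the two sides here, we obtain

$$\sum_{r=0}^{n} (-1)^r\sigma_rp_{n-r} = 0. \qquad (3.1)$$

The generating function $s(t)$ is expressed in terms of $p(t)$ and $\sigma(t)$ as
follows:

$$s(t) = \frac{p'(t)}{p(t)}, \qquad s(-t) = \frac{\sigma'(t)}{\sigma(t)}.$$

Equating the coefficients of $t^{n-1}$ on the two sides we obtain

$$np_n = \sum_{r=1}^{n} s_rp_{n-r}, \qquad (3.2)$$

$$n\sigma_n = \sum_{r=1}^{n} (-1)^{r-1}s_r\sigma_{n-r}. \qquad (3.3)$$

Relations (3.3) are called Newton's formulas.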
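
As a quick numerical illustration, relation (3.3) can be used to recover $\sigma_1, \dots, \sigma_k$ from $s_1, \dots, s_k$. The sketch below uses exact rational arithmetic; the function name `elementary_from_power_sums` and the sample values are assumptions made only for this example (Python 3.8 or later for `math.prod`).

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elementary_from_power_sums(s):
    """Given s[r-1] = s_r for r = 1..k, return [sigma_0, ..., sigma_k] using (3.3)."""
    sigma = [Fraction(1)]
    for n in range(1, len(s) + 1):
        total = sum((-1) ** (r - 1) * s[r - 1] * sigma[n - r] for r in range(1, n + 1))
        sigma.append(Fraction(total, n))
    return sigma

xs = [2, 3, 5]
power_sums = [sum(x ** r for x in xs) for r in range(1, 4)]      # s_1, s_2, s_3
via_newton = elementary_from_power_sums(power_sums)
direct = [sum(prod(c) for c in combinations(xs, k)) for k in range(4)]
print(via_newton == direct)
```

The division by $n$ at each step is the reason why, in general, the $\sigma_k$ are polynomials in $s_1, \dots, s_k$ with rational rather than integer coefficients.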
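Let us make more explicit the relations (3.1) for $n = 1, \dots, k$. With $\sigma_1,
\dots, \sigma_k$ regarded as fixed, these relations can be considered as a system of
linear equations for $p_1, \dots, p_k$ and similarly, with $p_1, \dots, p_k$ fixed, as a
system of linear equations for $\sigma_1, \dots, \sigma_k$. Solving these systems we find that

$$\sigma_k = \begin{vmatrix} p_1 & 1 & 0 & \dots & 0 \\ p_2 & p_1 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ p_{k-1} & p_{k-2} & p_{k-3} & \dots & 1 \\ p_k & p_{k-1} & p_{k-2} & \dots & p_1 \end{vmatrix}, \qquad p_k = \begin{vmatrix} \sigma_1 & 1 & 0 & \dots & 0 \\ \sigma_2 & \sigma_1 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{k-1} & \sigma_{k-2} & \sigma_{k-3} & \dots & 1 \\ \sigma_k & \sigma_{k-1} & \sigma_{k-2} & \dots & \sigma_1 \end{vmatrix}.$$

Similarly, from (3.2) we obtain

$$s_k = (-1)^{k-1} \begin{vmatrix} p_1 & 1 & 0 & \dots & 0 \\ 2p_2 & p_1 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ kp_k & p_{k-1} & p_{k-2} & \dots & p_1 \end{vmatrix}, \qquad p_k = \frac{1}{k!} \begin{vmatrix} s_1 & -1 & 0 & \dots & 0 \\ s_2 & s_1 & -2 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ s_{k-1} & s_{k-2} & s_{k-3} & \dots & -k+1 \\ s_k & s_{k-1} & s_{k-2} & \dots & s_1 \end{vmatrix}.$$

From (3.3) we deduce that

$$s_k = \begin{vmatrix} \sigma_1 & 1 & 0 & \dots & 0 \\ 2\sigma_2 & \sigma_1 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ k\sigma_k & \sigma_{k-1} & \sigma_{k-2} & \dots & \sigma_1 \end{vmatrix}, \qquad \sigma_k = \frac{1}{k!} \begin{vmatrix} s_1 & 1 & 0 & \dots & 0 \\ s_2 & s_1 & 2 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ s_{k-1} & s_{k-2} & s_{k-3} & \dots & k-1 \\ s_k & s_{k-1} & s_{k-2} & \dots & s_1 \end{vmatrix}.$$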


3.1.2 Main theorem on symmetric polynomials 


The elementary symmetric polynomials are algebraically independent and 
form a basis of the ring of symmetric polynomials. A more precise formu- 
lation of this statement is as follows. 


Theorem 3.1.1. Let $f(x_1, \dots, x_n)$ be a symmetric polynomial. Then there
exists a polynomial $g(y_1, \dots, y_n)$ such that $f(x_1, \dots, x_n) = g(\sigma_1, \dots, \sigma_n)$. This
polynomial $g$ is unique.


Proof. It suffices to consider the case where $f$ is a homogeneous polyno-
mial (a form). We will say that the order of a monomial $x_1^{\lambda_1} \cdots x_n^{\lambda_n}$ is greater
than that of $x_1^{\mu_1} \cdots x_n^{\mu_n}$ if

$$\lambda_1 = \mu_1, \ \dots, \ \lambda_k = \mu_k \quad \text{and} \quad \lambda_{k+1} > \mu_{k+1}.$$

(Here $k = 0$ is possible.) Let $ax_1^{\lambda_1} \cdots x_n^{\lambda_n}$ be the highest order monomial of
$f$. Then $\lambda_1 \ge \dots \ge \lambda_n$. Consider the symmetric polynomial

$$f_1 = f - a\sigma_1^{\lambda_1 - \lambda_2}\sigma_2^{\lambda_2 - \lambda_3} \cdots \sigma_n^{\lambda_n}. \qquad (1)$$

The highest order term of the monomial $\sigma_1^{\lambda_1 - \lambda_2} \cdots \sigma_n^{\lambda_n}$ is equal to

$$x_1^{\lambda_1 - \lambda_2}(x_1x_2)^{\lambda_2 - \lambda_3} \cdots (x_1 \cdots x_n)^{\lambda_n} = x_1^{\lambda_1}x_2^{\lambda_2} \cdots x_n^{\lambda_n}.$$

Hence the order of the highest order monomial of $f_1$ is strictly lower than that
of the highest order monomial of $f$.

Let us apply the operation (1) to $f_1$, and so on. Clearly, after finitely many
such operations we obtain the zero polynomial.
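
The reduction (1) is effectively an algorithm, and it can be sketched in a few lines. In the sketch below a polynomial is a dictionary mapping exponent tuples to integer coefficients; this representation and the names `poly_mul`, `elementary` and `to_elementary` are assumptions made for the illustration. The input is assumed to be symmetric; no check is performed.

```python
from itertools import combinations

def poly_mul(a, b):
    """Product of two polynomials given as {exponent tuple: coefficient}."""
    out = {}
    for ma, ca in a.items():
        for mb, cb in b.items():
            m = tuple(x + y for x, y in zip(ma, mb))
            out[m] = out.get(m, 0) + ca * cb
    return {m: c for m, c in out.items() if c}

def elementary(k, n):
    """The elementary symmetric polynomial sigma_k in n variables."""
    return {tuple(1 if i in idx else 0 for i in range(n)): 1
            for idx in combinations(range(n), k)}

def to_elementary(f, n):
    """Return {(e_1, ..., e_n): c} such that f = sum of c * sigma_1^e_1 ... sigma_n^e_n."""
    f = dict(f)
    g = {}
    while f:
        lam = max(f)                       # highest order monomial (lexicographic)
        a = f[lam]
        exps = tuple(lam[i] - (lam[i + 1] if i + 1 < n else 0) for i in range(n))
        term = {(0,) * n: a}
        for k, e in enumerate(exps, start=1):
            for _ in range(e):
                term = poly_mul(term, elementary(k, n))
        for m, c in term.items():          # f := f - a * sigma^exps, as in (1)
            f[m] = f.get(m, 0) - c
            if f[m] == 0:
                del f[m]
        g[exps] = g.get(exps, 0) + a
    return g

# x1^2 + x2^2 + x3^2 = sigma_1^2 - 2 sigma_2
print(to_elementary({(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}, 3))
```

Only integer additions and multiplications occur, which matches the remark that integer coefficients of $f$ lead to integer coefficients of $g$.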

Let us prove now the uniqueness of the representation $f(x_1, \dots, x_n) =
g(\sigma_1, \dots, \sigma_n)$. It suffices to verify that if

$$g(y_1, \dots, y_n) = \sum a_{i_1 \dots i_n}y_1^{i_1} \cdots y_n^{i_n}$$

is a nonzero polynomial, then after the substitution

$$y_1 = \sigma_1 = x_1 + \dots + x_n, \quad \dots, \quad y_n = \sigma_n = x_1 \cdots x_n$$

this polynomial remains nonzero. Let us confine ourselves to the highest order
monomials of the form

$$a_{i_1 \dots i_n}x_1^{i_1 + \dots + i_n}x_2^{i_2 + \dots + i_n} \cdots x_n^{i_n}$$

obtained after the substitution. It is clear that the highest among these mono-
mials cannot cancel with any other monomial.


It is obvious from the proof of Theorem 3.1.1 that, if $f(x_1, \dots, x_n)$
is a symmetric polynomial with integer coefficients, then $f(x_1, \dots, x_n) =
g(\sigma_1, \dots, \sigma_n)$, where the coefficients of $g$ are also integers. The determinant
formula for $\sigma_k$ in terms of $p_1, \dots, p_k$ indicates that for complete homogeneous
polynomials an analogous statement also holds.

As to sums of powers, an expression of the form $f(x_1, \dots, x_n) = g(s_1, \dots, s_n)$
also exists but the coefficients of $g$ are not integers now. For example,

$$x_1x_2 = \frac{(x_1 + x_2)^2 - (x_1^2 + x_2^2)}{2} = \frac{s_1^2 - s_2}{2}.$$

The main theorem on symmetric polynomials implies that, if $x_1, \dots, x_n$
are the roots of the polynomial

$$f(x) = x^n + a_1x^{n-1} + \dots + a_n,$$

then the quantity

$$D = \prod_{i<j} (x_i - x_j)^2,$$

which represents a symmetric polynomial in $x_1, \dots, x_n$, can be polynomially
expressed in terms of $a_1, \dots, a_n$. This quantity is called the discriminant of $f$.

A polynomial $f(x_1, \dots, x_n)$ is called skew-symmetric if

$$f(\dots, x_i, \dots, x_j, \dots) = -f(\dots, x_j, \dots, x_i, \dots),$$

i.e., under transposition of any two of its indeterminates $x_i$ and $x_j$ it changes
its sign. The polynomial $\Delta = \prod_{i<j} (x_i - x_j)$ is an example of a skew-symmetric
polynomial. Clearly, $\Delta^2 = D$.

Theorem 3.1.2. Any skew-symmetric polynomial $f(x_1, \dots, x_n)$ can be repre-
sented in the form

$$\Delta(x_1, \dots, x_n)g(x_1, \dots, x_n),$$

where $g$ is a symmetric polynomial.


Proof. It suffices to verify that $f$ is divisible by $\Delta$. Indeed, if $f/\Delta$ is a
polynomial, then, for obvious reasons, this polynomial is symmetric. Let us
show, for example, that $f$ is divisible by $x_1 - x_2$. We make the change of
variables $x_1 = v + u$, $x_2 = v - u$. As a result we obtain

$$f(x_1, x_2, x_3, \dots, x_n) = f_1(u, v, x_3, \dots, x_n).$$

If $x_1 = x_2$, then $u = 0$ and so $f_1(0, v, x_3, \dots, x_n) = 0$. This means that $f_1$
is divisible by $u$, i.e., $f$ is divisible by $x_1 - x_2$. We similarly prove that $f$ is
divisible by $x_i - x_j$ for all $i < j$.


The equality A? = D shows that the representation f = Ag is not unique. 


3.1.3 Muirhead’s inequalities 


Let $\lambda = (\lambda_1, \dots, \lambda_n)$ be a partition, i.e., an ordered set of non-negative integers
$\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge 0$. Set $|\lambda| = \lambda_1 + \dots + \lambda_n$. We say that $\lambda \ge \mu$ if
$\lambda_1 + \dots + \lambda_k \ge \mu_1 + \dots + \mu_k$ for $k = 1, 2, \dots, n$.

To every partition $\lambda$ one can assign a homogeneous symmetric polynomial

$$M_\lambda(x_1, \dots, x_n) = \frac{1}{n!} \sum_{\sigma \in S_n} x_{\sigma(1)}^{\lambda_1} \cdots x_{\sigma(n)}^{\lambda_n}. \qquad (1)$$

Clearly, $\deg M_\lambda = |\lambda|$.

Example 1. If $\lambda = (1, \dots, 1)$, then $M_\lambda(x_1, \dots, x_n) = x_1 \cdots x_n$.
Indeed, the sum (1) in this case consists of $n!$ summands $x_1 \cdots x_n$.


Example 2. If $\lambda = (n, 0, \dots, 0)$, then $M_\lambda(x_1, \dots, x_n) = \frac{1}{n}(x_1^n + \dots + x_n^n)$.
Indeed, the sum (1) consists in this case of $(n - 1)!$ summands $x_1^n$, $(n - 1)!$
summands $x_2^n$, and so on.

For positive $x_1, \dots, x_n$, the inequality

$$\frac{x_1^n + \dots + x_n^n}{n} \ge x_1 \cdots x_n$$

holds (this is the well-known inequality between the arithmetic and geometric
means). The following statement is a generalization of this inequality.


Theorem 3.1.3 (Muirhead, [Mu1]). The inequality

$$M_\lambda(x) \ge M_\mu(x) \qquad (2)$$

holds for all vectors $x = (x_1, \dots, x_n)$ with positive coordinates $x_1, \dots, x_n$ if
and only if $|\lambda| = |\mu|$ and $\lambda \ge \mu$. The equality is only attained if $\lambda = \mu$ or
$x_1 = \dots = x_n$.


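Proof. Suppose first that (2) holds for all $x > 0$. Let $x_1 = \dots = x_k = a$
and $x_{k+1} = \dots = x_n = 1$. Then, as $a \to \infty$, the ratio $M_\lambda(x)/M_\mu(x)$ behaves like

$$c\,\frac{a^{\lambda_1 + \dots + \lambda_k}}{a^{\mu_1 + \dots + \mu_k}}$$

with a positive constant $c$, and this ratio is at least 1 for all $a$.
Therefore $\lambda_1 + \dots + \lambda_k \ge \mu_1 + \dots + \mu_k$.

Now take $k = n$ and $x_1 = \dots = x_n = a$. Then

$$\frac{M_\lambda(x)}{M_\mu(x)} = \frac{a^{|\lambda|}}{a^{|\mu|}}.$$

For $a > 1$, we deduce, as earlier, that $|\lambda| \ge |\mu|$ whereas, for $0 < a < 1$, we
obtain $|\lambda| \le |\mu|$.

The proof of the statement in the opposite direction is more complicated.
It makes use of the following transformation $R_{ij}$. Let $\mu_i \ge \mu_j > 0$, where
$i < j$. Set $R_{ij}\mu = \mu'$, where $\mu'_i = \mu_i + 1$, $\mu'_j = \mu_j - 1$ and $\mu'_k = \mu_k$ for $k \ne i, j$.
It is easy to verify that $\mu' \ge \mu$ and $|\mu'| = |\mu|$.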


Lemma 1. If $\lambda = R_{ij}\mu$, then $M_\lambda(x) \ge M_\mu(x)$ and the equality is only
obtained if $x_1 = \dots = x_n$ (we assume that the numbers $x_1, \dots, x_n$ are posi-
tive).


Proof. For every pair of indices $p$ and $q$ such that $1 \le p < q \le n$, the
difference $M_\lambda(x) - M_\mu(x)$ contains a summand of the form

$$A\,(x_p^{\lambda_i}x_q^{\lambda_j} + x_q^{\lambda_i}x_p^{\lambda_j} - x_p^{\mu_i}x_q^{\mu_j} - x_q^{\mu_i}x_p^{\mu_j}), \qquad (3)$$

where $A$ is a positive number.
To make the presentation more readable, we write $x_p = a$, $x_q = b$, $\mu_i = \alpha$,
$\mu_j = \beta$. Recall that $\lambda_i = \alpha + 1$, $\lambda_j = \beta - 1$ and $\alpha \ge \beta$. Expression (3) divided
by $A$ is equal to

$$a^{\alpha+1}b^{\beta-1} + a^{\beta-1}b^{\alpha+1} - a^{\alpha}b^{\beta} - a^{\beta}b^{\alpha} = (ab)^{\beta-1}(a - b)(a^{\alpha-\beta+1} - b^{\alpha-\beta+1}) \ge 0,$$

where the equality is only possible when $a = b$. Thus $M_\lambda(x) - M_\mu(x) \ge 0$
and, if among the numbers $x_1, \dots, x_n$ there are at least two distinct ones,
then the inequality is strict.


Lemma 2. If $\lambda \ge \mu$ and $|\lambda| = |\mu|$ but $\lambda \ne \mu$, then $\lambda$ can be obtained
from $\mu$ after a finite number of transformations $R_{ij}$.


Proof. Let $i$ be the least index for which $\lambda_i \ne \mu_i$. Then the condition
$\lambda \ge \mu$ implies that $\lambda_i > \mu_i$. The equality $|\lambda| = |\mu|$ means that $\sum(\lambda_k - \mu_k) = 0$,
and so $\lambda_j < \mu_j$ for some $j$. Clearly, $i < j$ and $\mu_j > 0$. Hence we can apply $R_{ij}$
to $\mu$. As a result, we obtain a sequence $\nu$ in which $\nu_i = \mu_i + 1$, $\nu_j = \mu_j - 1$,
and $\nu_k = \mu_k$ for $k \ne i, j$. Taking into account that $\lambda_i > \mu_i$ and $\lambda_j < \mu_j$ we
obtain

$$|\lambda_i - \mu_i| = |\lambda_i - \nu_i| + 1, \qquad |\lambda_j - \mu_j| = |\lambda_j - \nu_j| + 1.$$

Therefore

$$\sum |\lambda_k - \nu_k| = \sum |\lambda_k - \mu_k| - 2,$$

i.e., using $R_{ij}$ we have diminished $\sum |\lambda_k - \mu_k|$ by 2. Therefore, using a certain
number of transformations $R_{ij}$, we can reduce $\sum |\lambda_k - \mu_k|$ to zero.


Lemmas 1 and 2 obviously imply Muirhead's inequality.
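
The inequality is easy to test numerically on a chain of partitions. The sketch below evaluates $M_\lambda$ straight from definition (1) with exact arithmetic; the names `muirhead_mean` and `dominates` and the sample point are assumptions made for this illustration (Python 3.8 or later for `math.prod`).

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod

def muirhead_mean(lam, x):
    """M_lambda(x): average over all permutations of x_s(1)^lam_1 ... x_s(n)^lam_n."""
    n = len(x)
    total = sum(prod(x[s[i]] ** lam[i] for i in range(n))
                for s in permutations(range(n)))
    return Fraction(total, factorial(n))

def dominates(lam, mu):
    """True if |lam| = |mu| and every partial sum of lam is at least that of mu."""
    return sum(lam) == sum(mu) and all(
        sum(lam[:k]) >= sum(mu[:k]) for k in range(1, len(lam) + 1))

x = (1, 2, 3)
chain = [(3, 0, 0), (2, 1, 0), (1, 1, 1)]
print([muirhead_mean(lam, x) for lam in chain])
print(all(dominates(a, b) for a, b in zip(chain, chain[1:])))
```

Here $(2,1,0) = R_{12}(1,1,1)$ and $(3,0,0) = R_{12}(2,1,0)$, so the chain is exactly the kind of sequence produced in Lemma 2, and the computed values decrease along it.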


3.1.4 The Schur functions 


Consider the infinite matrix

$$P = \begin{pmatrix} p_0 & p_1 & p_2 & \dots \\ 0 & p_0 & p_1 & \dots \\ 0 & 0 & p_0 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix},$$

where $p_i = p_i(x_1, \dots, x_n)$ is the complete homogeneous polynomial of degree
$i$. The $(i, j)$th element of the matrix $P$ is equal to $p_{j-i}$. The Schur function
or S-function corresponding to a partition $\lambda$ is the minor of $P$ formed by the
first $n$ rows $0, 1, \dots, n - 1$ and columns $\lambda_n, \lambda_{n-1} + 1, \dots, \lambda_1 + n - 1$. This
symmetric function in $x_1, \dots, x_n$ is denoted by $s_\lambda$. The function $s_\lambda$ can be
expressed as a determinant as follows:

$$s_\lambda = |p_{\lambda_{n+1-j} + j - i}|_1^n = |p_{\lambda_j + i - j}|_1^n.$$

Considering the transposed matrix we get $s_\lambda = |p_{\lambda_i + j - i}|_1^n$.
The skew Schur function $s_{\lambda,\mu}$ corresponding to a pair of partitions $\lambda$ and
$\mu$ is the minor of the matrix $P$ formed by the rows with numbers $\mu_n, \mu_{n-1} + 1,
\dots, \mu_1 + n - 1$ and the columns with numbers $\lambda_n, \lambda_{n-1} + 1, \dots, \lambda_1 + n - 1$.
Clearly, $s_\lambda = s_{\lambda,0}$. A partition $\mu$ is called a subpartition of $\lambda$ if $\lambda_i \ge \mu_i$ for
$i = 1, \dots, n$. One can prove that, if $\mu$ is not a subpartition of $\lambda$, then $s_{\lambda,\mu} = 0$.

The Schur functions were introduced by Jacobi long before Schur as quo-
tients of skew-symmetric functions of a certain type. Let $\alpha = (\alpha_1, \dots, \alpha_n)$ be
a partition and $a_\alpha$ the anti-symmetrization of the monomial $x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$,
i.e.,

$$a_\alpha = \sum_{w \in S_n} (-1)^w w(x^\alpha),$$

where $(-1)^w$ is the sign of the permutation $w$ and $w(x^\alpha) = x_{w(1)}^{\alpha_1} \cdots x_{w(n)}^{\alpha_n}$. It
is easy to verify that the polynomial $a_\alpha(x_1, \dots, x_n)$ is equal to the determinant
$|x_i^{\alpha_j}|_1^n$; in particular, this polynomial is skew-symmetric. Hence, if $\alpha_i = \alpha_{i+1}$
for some $i$, then $a_\alpha = 0$. Thus, we may assume that $\alpha = \lambda + \delta$, where

$$\delta = (n - 1, n - 2, \dots, 1, 0).$$


Theorem 3.1.4 (Jacobi-Trudi identity). Let $\delta = (n - 1, \dots, 1, 0)$. Then

$$s_\lambda = \frac{a_{\lambda+\delta}}{a_\delta} = \frac{|x_i^{\lambda_j + n - j}|_1^n}{|x_i^{n - j}|_1^n}.$$

Proof. Let $\alpha = (\alpha_1, \dots, \alpha_n)$ be a partition. Consider the matrices

$$A_\alpha = \|x_j^{\alpha_i}\|_1^n, \qquad H_\alpha = \|p_{\alpha_i - n + j}\|_1^n \qquad \text{and} \qquad M = \|(-1)^{n-i}\sigma_{n-i}(\hat x_j)\|_1^n,$$

where $\hat x_j = (x_1, \dots, x_{j-1}, x_{j+1}, \dots, x_n)$. Let us show that these three matrices
are related by the formula

$$H_\alpha M = A_\alpha. \qquad (1)$$

Let

$$\sigma^{(j)}(t) = \sum_{k=0}^{n-1} \sigma_k(\hat x_j)t^k = \prod_{i \ne j} (1 + x_it)$$

and

$$p(t) = \sum_{k=0}^{\infty} p_kt^k = \prod_{i=1}^{n} (1 - x_it)^{-1}.$$

Then

$$p(t)\sigma^{(j)}(-t) = (1 - x_jt)^{-1}.$$

By comparing the coefficients of $t^{\alpha_i}$ on both sides of this equality we deduce
that

$$\sum_{l=1}^{n} p_{\alpha_i - n + l}(-1)^{n-l}\sigma_{n-l}(\hat x_j) = x_j^{\alpha_i}.$$

This is precisely the relation (1) required.

Relation (1) implies, in particular, that

$$\det H_\alpha \det M = \det A_\alpha. \qquad (2)$$

To compute $\det M$, we set $\alpha = \delta = (n - 1, \dots, 1, 0)$. In this case the matrix
$H_\delta$ is of the form $\|p_{j-i}\|_1^n$. This matrix is triangular with elements $p_0 = 1$
on the main diagonal. Hence $\det H_\delta = 1$, and therefore $\det M = \det A_\delta = a_\delta$.

But since $\det H_\alpha = s_{\alpha - \delta}$, it follows that for $\alpha = \lambda + \delta$ equation (2) takes the
form $s_\lambda a_\delta = a_{\lambda+\delta}$, i.e., $s_\lambda = \frac{a_{\lambda+\delta}}{a_\delta}$.
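
The identity can be checked numerically for small $n$ by evaluating both sides at a point with distinct coordinates. The following sketch does this with exact integer arithmetic and a permutation-expansion determinant, which is adequate only for small matrices; the function names and the sample point are assumptions made for this illustration (Python 3.8 or later for `math.prod`).

```python
from fractions import Fraction
from itertools import combinations_with_replacement, permutations
from math import prod

def det(m):
    """Determinant by expansion over permutations (fine for small matrices)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        total += (-1) ** inversions * prod(m[i][perm[i]] for i in range(n))
    return total

def complete_homogeneous(k, x):
    """p_k(x): sum of all monomials of degree k; zero for negative k."""
    if k < 0:
        return 0
    return sum(prod(c) for c in combinations_with_replacement(x, k))

def schur_from_p(lam, x):
    """s_lambda as the determinant |p_{lambda_i + j - i}|."""
    n = len(x)
    return det([[complete_homogeneous(lam[i] + j - i, x) for j in range(n)]
                for i in range(n)])

def schur_as_quotient(lam, x):
    """s_lambda as the quotient a_{lambda + delta} / a_delta."""
    n = len(x)
    num = det([[x[i] ** (lam[j] + n - 1 - j) for j in range(n)] for i in range(n)])
    den = det([[x[i] ** (n - 1 - j) for j in range(n)] for i in range(n)])
    return Fraction(num, den)

lam, x = (2, 1, 0), (1, 2, 3)
print(schur_from_p(lam, x), schur_as_quotient(lam, x))
```

For $\lambda = (2, 1, 0)$ and $n = 3$ both expressions equal $(x_1 + x_2)(x_1 + x_3)(x_2 + x_3)$, which at the point $(1, 2, 3)$ is $3 \cdot 4 \cdot 5 = 60$.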


3.2 Integer-valued polynomials 


3.2.1 A basis in the space of integer-valued polynomials 


The polynomial $p(x)$ is called integer-valued if it assumes only integer values
for all integers $x$.
By induction on $k$ one can prove that the polynomial

$$\binom{x}{k} = \frac{x(x-1)\cdots(x-k+1)}{k!}$$

is integer-valued. Indeed, for $k = 1$, this is obvious. Suppose that $\binom{x}{k}$ is an
integer-valued polynomial. It is easy to verify that

$$\binom{x+1}{k+1} - \binom{x}{k+1} = \binom{x}{k}.$$

Hence, for all integers $m$ and $n$, the difference $\binom{m}{k+1} - \binom{n}{k+1}$ is an integer. It
remains to observe that $\binom{0}{k+1} = 0$ for any $k$.

In a sense, the integer-valued polynomials are exhausted by the polynomials
$\binom{x}{k}$. Moreover, the requirement that $p(n) \in \mathbb{Z}$ for all $n \in \mathbb{Z}$ for $p(x)$ to be
integer-valued can be considerably weakened, as we now prove.


Theorem 3.2.1. Let $p_k$ be a polynomial of degree $k$ assuming integer values
at $x = n, n+1, \dots, n+k$ for an integer $n$. Then

$$p_k(x) = c_0\binom{x}{k} + c_1\binom{x}{k-1} + \dots + c_k,$$

where $c_0, c_1, \dots, c_k$ are integers.


Proof. The polynomials

$$\binom{x}{0} = 1, \ \binom{x}{1}, \ \binom{x}{2}, \ \dots, \ \binom{x}{k}$$

form a basis in the space of polynomials of degree not greater than $k$, and
hence

$$p_k(x) = c_0\binom{x}{k} + c_1\binom{x}{k-1} + \dots + c_k,$$

where $c_0, c_1, \dots, c_k$ are some numbers. It only remains to prove that these
numbers are integers.

We use induction on $k$. For $k = 0$, the polynomial $p_0(x) = c_0$ assumes an
integer value at $x = n$, and so $c_0$ is an integer. Suppose now the
required statement is true for all polynomials of degree not greater than $k$.
Let the polynomial

$$p_{k+1}(x) = c_0\binom{x}{k+1} + c_1\binom{x}{k} + \dots + c_{k+1}$$

take integer values at $x = n, n+1, \dots, n+k+1$. Then the polynomial

$$\Delta p_{k+1}(x) = p_{k+1}(x+1) - p_{k+1}(x) = c_0\binom{x}{k} + c_1\binom{x}{k-1} + \dots + c_k$$

takes integer values at $x = n, n+1, \dots, n+k$. Therefore $c_0, c_1, \dots, c_k$ are
integers, and hence so is

$$c_{k+1} = p_{k+1}(n) - c_0\binom{n}{k+1} - c_1\binom{n}{k} - \dots - c_k\binom{n}{1}.$$
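The induction in this proof is effectively a finite-difference computation. As an illustrative sketch (assuming $n = 0$ for simplicity, with hypothetical function names), the coefficients in the basis $\binom{x}{j}$ are the iterated forward differences of the values $p(0), \dots, p(k)$; in the numbering of the theorem, $c_i$ is the coefficient of $\binom{x}{k-i}$.

```python
from math import comb


def binomial_basis_coefficients(values):
    """Given p(0), p(1), ..., p(k), return [d_0, ..., d_k] such that
    p(x) = sum_j d_j * C(x, j); d_j is the j-th forward difference at 0."""
    coeffs = []
    row = list(values)
    while row:
        coeffs.append(row[0])
        # Replace the row by its forward differences (one entry shorter).
        row = [b - a for a, b in zip(row, row[1:])]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate sum_j d_j * C(x, j) at a non-negative integer x."""
    return sum(d * comb(x, j) for j, d in enumerate(coeffs))


if __name__ == "__main__":
    # p(x) = x(x + 1)/2 has non-integer monomial coefficients,
    # but integer coefficients in the binomial basis.
    print(binomial_basis_coefficients([0, 1, 3]))
```

Integer input values always produce integer differences, which is the content of the theorem.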


Theorem 3.2.2. Let R(x) be a rational function which takes integer values 
at all integer x. Then R(x) is an integer-valued polynomial. 


Proof. We write $R(x) = \dfrac{f(x)}{g(x)}$, where $f$ and $g$ are polynomials. Having
divided $f$ by $g$ with remainder, we see that

$$R(x) = p_k(x) + r(x),$$

where $p_k$ is a polynomial of degree $k$ and $r(x) \to 0$ as $x \to \infty$. Therefore, for
large values of $n$, the values of $p_k(n)$ differ but slightly from integers. Let us
show that $p_k(x)$ is an integer-valued polynomial. This is done by almost the
same arguments as in the proof of Theorem 3.2.1.

Let us express $p_k(x)$ in the form

$$p_k(x) = c_0\binom{x}{k} + c_1\binom{x}{k-1} + \dots + c_k.$$

For $k = 0$, the number $c_0$ is arbitrarily close to an integer, and hence $c_0 \in \mathbb{Z}$.
The polynomial

$$\Delta p_k(x) = p_k(x+1) - p_k(x) = c_0\binom{x}{k-1} + c_1\binom{x}{k-2} + \dots + c_{k-1}$$

also assumes almost integer values at large integer $x$ and its degree is equal
to $k - 1$. Applying the induction hypothesis to it, we see that $c_0, c_1, \dots, c_{k-1}$
are integers. The number

$$c_k = p_k(n) - c_0\binom{n}{k} - \dots - c_{k-1}\binom{n}{1}$$

does not depend on $n$ and is arbitrarily close to an integer for large $n$, so it
is also an integer. It remains to prove that $r(x) = 0$. As we already know,
$r(n) \in \mathbb{Z}$ for $n \in \mathbb{Z}$ and $r(n) \to 0$ as $n \to \infty$. Therefore $r(n) = 0$ for all
sufficiently large integers $n$. But any rational function with infinitely many
zeros is identically zero.


Corollary. Let $f(x)$ and $g(x)$ be polynomials with integer coefficients and
let $f(n)$ be divisible by $g(n)$ at all integers $n$. Then

$$f(x) = \left(\sum_{k=0}^{m} c_k\binom{x}{k}\right) g(x),$$

where $c_0, \dots, c_m$ are integers.


Pólya [Pol] has shown that if an entire analytic function $f(z)$ assumes
integer values at all integer or positive integer values of $z$ and grows not
too quickly, then $f(z)$ is an integer-valued polynomial. More precisely, the
following statements hold:

1) If $f(\mathbb{N}) \subset \mathbb{N}$ and $|f(z)| < Ce^{k|z|}$, where $k < \ln 2$, then $f$ is an integer-valued
polynomial (for a proof of this statement, see [Ge2]).

2) If $f(\mathbb{Z}) \subset \mathbb{Z}$ and $|f(z)| < Ce^{k|z|}$, where $k < \ln\left(\frac{3+\sqrt{5}}{2}\right)$, then $f$ is an
integer-valued polynomial.

The examples of the functions $2^z$ and $\dfrac{1}{\sqrt{5}}\left(\left(\dfrac{3+\sqrt{5}}{2}\right)^z - \left(\dfrac{3-\sqrt{5}}{2}\right)^z\right)$ show that
both estimates are the best possible.


3.2.2 Integer-valued polynomials in several variables 


The structure of the basis for the space of integer-valued polynomials in n 
variables is similar to the one-variable case. 


Theorem 3.2.3 ([Os2]). The polynomial $p_{d_1 \dots d_n}(x_1, \dots, x_n)$ of degree $d_i$
with respect to $x_i$ assumes integer values at $x_1 = a_1, a_1+1, \dots, a_1+d_1$,
..., $x_n = a_n, a_n+1, \dots, a_n+d_n$ if and only if

$$p_{d_1 \dots d_n}(x_1, \dots, x_n) = \sum_{k_1=0}^{d_1} \cdots \sum_{k_n=0}^{d_n} c_{k_1 \dots k_n}\binom{x_1}{k_1} \cdots \binom{x_n}{k_n},$$

where $c_{k_1 \dots k_n}$ are integers. In particular, such a polynomial assumes integer
values at all integer points $(x_1, \dots, x_n)$.

Proof. Let us consider the case $n = 2$ since the general case is similar. For
a fixed $x_1 \in \{a_1, \dots, a_1+d_1\}$, the polynomial $p_{d_1d_2}(x_1, x_2)$ takes integer values
at $x_2 = a_2, \dots, a_2+d_2$. Therefore, by Theorem 3.2.1, for $x_1 = a_1, \dots, a_1+d_1$
we have the identity

$$p_{d_1d_2}(x_1, x_2) = \sum_{k_2=0}^{d_2} c_{k_2}(x_1)\binom{x_2}{k_2}, \tag{1}$$

where $c_{k_2}(a_1), \dots, c_{k_2}(a_1+d_1)$ are integers. If we now consider (1) as a relation
for polynomials in the variables $x_1$ and $x_2$, then it is clear that $c_{k_2}(x_1)$ is a uniquely
determined polynomial. As we have already shown, this polynomial (of degree
not greater than $d_1$) assumes integer values at $x_1 = a_1, \dots, a_1+d_1$, so
Theorem 3.2.1 applies to it as well, as was required.


3.2.3 The q-analogue of integer-valued polynomials 


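Gauss's binomial coefficient, or the q-binomial coefficient, is

$$\begin{bmatrix} n \\ k \end{bmatrix}_q = \frac{(q^n-1)(q^{n-1}-1)\cdots(q^{n-k+1}-1)}{(q^k-1)(q^{k-1}-1)\cdots(q-1)}.$$

In the limit as $q \to 1$ Gauss's binomial coefficient becomes the usual binomial
coefficient $\binom{n}{k}$. The Gauss binomial coefficient is one of the numerous
q-analogues of elementary and special functions, see, e.g., [Ki] and [Ga3].

A q-analogue of the identity $\binom{n+1}{k} = \binom{n}{k} + \binom{n}{k-1}$ is

$$\begin{bmatrix} n+1 \\ k \end{bmatrix}_q = \begin{bmatrix} n \\ k \end{bmatrix}_q + q^{n-k+1}\begin{bmatrix} n \\ k-1 \end{bmatrix}_q. \tag{1}$$

To prove (1), it suffices to observe that after simplification this identity
becomes

$$\frac{q^{n+1}-1}{(q^k-1)(q^{n-k+1}-1)} = \frac{1}{q^k-1} + \frac{q^{n-k+1}}{q^{n-k+1}-1}.$$

In what follows we will assume that $q$, $n$ and $k$ are integers such that $q \ge 2$
and $1 \le k \le n$. In this case induction on $n$ based on formula (1) shows that
$\begin{bmatrix} n \\ k \end{bmatrix}_q$ is an integer.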


Now consider polynomials $f_0, f_1, f_2, \dots$, where $f_0(x) = 1$ and

$$f_k(x) = \frac{(x-1)(x-q)\cdots(x-q^{k-1})}{(q^k-1)(q^k-q)\cdots(q^k-q^{k-1})} \quad \text{for } k > 0. \tag{2}$$

It is easy to verify that

$$f_k(q^n) = 0 \ \text{ for } n = 0, 1, \dots, k-1 \quad \text{and} \quad f_k(q^k) = 1. \tag{3}$$

Moreover, $f_k(q^n) = \begin{bmatrix} n \\ k \end{bmatrix}_q$ for $n \ge k$. In particular, for any positive integer $n$,
the number $f_k(q^n)$ is an integer.

Theorem 3.2.4. The polynomial $p_k(x)$ of degree $k$ assumes integer values at
$x = 1, q, q^2, \dots, q^k$ if and only if

$$p_k(x) = c_kf_k(x) + c_{k-1}f_{k-1}(x) + \dots + c_1f_1(x) + c_0, \tag{4}$$

where $c_0, c_1, \dots, c_k$ are integers and the $f_i$ are defined by formula (2). In
particular, such a polynomial assumes integer values at all $x = q^n$ ($n \in \mathbb{N}$).

Proof. The polynomials $f_0, f_1, \dots, f_k$ form a basis of the linear space
of the polynomials of degree not greater than $k$, and so (4) holds for some
$c_0, c_1, \dots, c_k \in \mathbb{C}$. It remains to verify that $c_0, \dots, c_k \in \mathbb{Z}$. Formulas (3) show
that

$$p_k(1) = c_0,$$
$$p_k(q) = c_1 + c_0,$$
$$p_k(q^2) = c_2 + c_1f_1(q^2) + c_0,$$
$$\dots$$
$$p_k(q^k) = c_k + c_{k-1}f_{k-1}(q^k) + \dots + c_1f_1(q^k) + c_0.$$

Therefore we obtain recursively

$$c_0 \in \mathbb{Z} \Rightarrow c_1 \in \mathbb{Z} \Rightarrow \dots \Rightarrow c_k \in \mathbb{Z}.$$
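The integrality claims of this subsection are easy to test for small parameters. The sketch below is illustrative only (function names are hypothetical): it computes the Gauss binomial coefficient by the recurrence $\begin{bmatrix} n \\ k \end{bmatrix}_q = \begin{bmatrix} n-1 \\ k \end{bmatrix}_q + q^{n-k}\begin{bmatrix} n-1 \\ k-1 \end{bmatrix}_q$ (the q-analogue of Pascal's rule with $n$ shifted by one), using integers only, and compares it with the product formula and with the values $f_k(q^n)$ computed in exact rational arithmetic.

```python
from fractions import Fraction


def gauss_binomial(n, k, q):
    """Gauss binomial coefficient via the q-Pascal recurrence (integers only)."""
    if k == 0:
        return 1
    if k > n:
        return 0
    return gauss_binomial(n - 1, k, q) + q ** (n - k) * gauss_binomial(n - 1, k - 1, q)


def gauss_binomial_product(n, k, q):
    """The same coefficient from the defining quotient of products."""
    value = Fraction(1)
    for i in range(k):
        value *= Fraction(q ** (n - i) - 1, q ** (k - i) - 1)
    return value


def f(k, x, q):
    """f_k(x) = prod_{i<k} (x - q^i) / (q^k - q^i), with f_0 = 1."""
    value = Fraction(1)
    for i in range(k):
        value *= Fraction(x - q ** i, q ** k - q ** i)
    return value


if __name__ == "__main__":
    print(gauss_binomial(4, 2, 2), gauss_binomial_product(4, 2, 2), f(2, 2 ** 4, 2))
```

Since the recurrence only adds and multiplies integers, it makes the integrality of the coefficient evident, which is the induction the text refers to.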


Lastly, observe that if an entire analytic function assumes integer values
at the points $1, q, q^2, \dots$ and grows not too quickly, then this function is a
polynomial. For a precise formulation and proof of this statement, see [Ge2].


3.3 The cyclotomic polynomials 


3.3.1 Main properties of the cyclotomic polynomials 


The polynomial

$$\Phi_n(x) = \prod_{k=1}^{\varphi(n)} (x - \varepsilon_k),$$

where $\varepsilon_1, \dots, \varepsilon_{\varphi(n)}$ are the primitive $n$-th roots of unity, is called the
cyclotomic polynomial of order $n$; its degree is $\varphi(n)$. For example,

$$\Phi_1(x) = x - 1, \quad \Phi_2(x) = x + 1, \quad \Phi_3(x) = x^2 + x + 1, \quad \Phi_4(x) = x^2 + 1.$$

If $n > 2$, then $\pm 1$ are not primitive roots of unity of degree $n$. In this case
the primitive roots split into pairs of complex conjugate numbers. Therefore,
for $n > 2$, the degree of $\Phi_n$ is even.

We deduce directly from the definition of the cyclotomic polynomial that

$$\prod_{d|n} \Phi_d(x) = x^n - 1.$$

Theorem 3.3.1. Let $n > 1$ be odd. Then

$$\Phi_{2n}(x) = \Phi_n(-x).$$

Proof. If $\varepsilon_1, \dots, \varepsilon_{\varphi(n)}$ are all the primitive roots of degree $n$, then

$$-\varepsilon_1, \dots, -\varepsilon_{\varphi(n)}$$

are all the primitive roots of degree $2n$. Indeed, for $n$ odd, we have $\varphi(2n) = \varphi(n)$.
Therefore the number of primitive $n$-th roots of unity is equal to the
number of primitive $2n$-th roots of unity. It remains to prove that if $\varepsilon$ is a
primitive $n$-th root of unity, then $-\varepsilon$ is a primitive $2n$-th root of unity. If
$0 < k < n$, then $\varepsilon^k \ne -1$. Indeed, if $\varepsilon^k = -1$, then $\varepsilon^{2k} = 1$ and $\varepsilon^{2n-2k} = 1$,
but either $2k < n$ or $2n - 2k < n$. Thus, if $0 < k < n$, then $(-\varepsilon)^k \ne 1$ and
$(-\varepsilon)^{n+k} = -(-\varepsilon)^k \ne 1$.
Therefore

$$\Phi_n(-x) = (-x - \varepsilon_1) \cdots (-x - \varepsilon_{\varphi(n)}),$$

$$\Phi_{2n}(x) = (x + \varepsilon_1) \cdots (x + \varepsilon_{\varphi(n)}).$$

It remains to recall that the degree of $\Phi_n$ is even.


3.3.2 The Möbius inversion formula


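The relation

$$\prod_{d|n} \Phi_d(x) = x^n - 1$$

enables us to express $\Phi_n(x)$ in terms of $x^d - 1$, where $d$ runs over the set of
divisors of $n$. To this end, we have a rather general construction based on the
Möbius function

$$\mu(n) = \begin{cases} 1 & \text{if } n = 1; \\ (-1)^k & \text{if } n = p_1 \cdots p_k; \\ 0 & \text{if } n = p^2m, \end{cases}$$

where $p_1, \dots, p_k$ are distinct primes and $p$ is a prime.

Theorem 3.3.2 (Möbius). If $F(n) = \sum_{d|n} f(d)$, then

$$f(n) = \sum_{d|n} \mu(d)\,F\left(\frac{n}{d}\right) = \sum_{d|n} \mu\left(\frac{n}{d}\right)F(d).$$

Proof. Let us verify first that $\sum_{d|n} \mu(d) = 0$ for all $n > 1$. Let $n = p_1^{a_1} \cdots p_k^{a_k}$.
Only the divisors of $p_1 \cdots p_k$ contribute to the sum, and so

$$\sum_{d|n} \mu(d) = \sum_{i=0}^{k} (-1)^i\binom{k}{i} = (1-1)^k = 0.$$

Clearly,

$$\sum_{ab=n} F(a)\mu(b) = \sum_{d_1d_2b=n} f(d_1)\mu(b) = \sum_{d_1|n} f(d_1) \sum_{d_2b=n/d_1} \mu(b).$$

Let $m = \dfrac{n}{d_1}$. Then

$$\sum_{d_2b=m} \mu(b) = \sum_{b|m} \mu(b) = \begin{cases} 1 & \text{if } n = d_1; \\ 0 & \text{if } n \ne d_1. \end{cases}$$

Hence

$$\sum_{d_1|n} f(d_1) \sum_{d_2b=n/d_1} \mu(b) = f(n).$$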
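The inversion formula can be checked on an arbitrary arithmetic function. In the sketch below the test function `f` and all names are illustrative choices, not from the text; the Möbius function is computed by trial division.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # n is divisible by p^2
            result = -result
        p += 1
    if n > 1:                     # one remaining prime factor
        result = -result
    return result


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def f(n):
    """An arbitrary test function."""
    return n * n + 1


def F(n):
    """F(n) = sum of f(d) over the divisors d of n."""
    return sum(f(d) for d in divisors(n))


def f_recovered(n):
    """Right-hand side of the inversion formula: sum of mu(d) * F(n/d)."""
    return sum(mobius(d) * F(n // d) for d in divisors(n))


if __name__ == "__main__":
    print([f_recovered(n) for n in range(1, 8)])
    print([f(n) for n in range(1, 8)])
```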


Corollary. If $F(n) = \prod_{d|n} f(d)$, then

$$f(n) = \prod_{d|n} F\left(\frac{n}{d}\right)^{\mu(d)} = \prod_{d|n} F(d)^{\mu(n/d)}.$$

For cyclotomic polynomials, the Möbius inversion formula yields

$$\Phi_n(x) = \prod_{d|n} (x^d - 1)^{\mu(n/d)}.$$
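The product formula gives a direct way to compute $\Phi_n$. The following sketch rests on one observation that is an assumption of the example rather than a statement of the text: for $n > 1$ the sum of $\mu(n/d)$ over $d \mid n$ vanishes, so every factor $x^d - 1$ may be replaced by $1 - x^d$ without changing the product. Multiplying or dividing a coefficient list by $1 - x^d$ is then a single in-place pass. Function names are illustrative.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def cyclotomic(n):
    """Coefficients of Phi_n, lowest degree first."""
    if n == 1:
        return [-1, 1]
    coeffs = [1] + [0] * n        # power series truncated after x^n
    for d in range(1, n + 1):
        if n % d:
            continue
        mu = mobius(n // d)
        if mu == 1:               # multiply by (1 - x^d), highest index first
            for i in range(n, d - 1, -1):
                coeffs[i] -= coeffs[i - d]
        elif mu == -1:            # divide by (1 - x^d), lowest index first
            for i in range(d, n + 1):
                coeffs[i] += coeffs[i - d]
    while coeffs[-1] == 0:        # drop the unused high-order zeros
        coeffs.pop()
    return coeffs


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 12):
        print(n, cyclotomic(n))
```

Only integer additions and subtractions occur, which is consistent with the coefficients of $\Phi_n$ being integers.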
Theorem 3.3.3. The coefficients of $\Phi_n(x)$ are integers.

Proof. In the product $\prod_{d|n} (x^d - 1)^{\mu(n/d)}$ let us group the factors with
$\mu = 1$ and, separately, the factors with $\mu = -1$. As a result, we see that
$\Phi_n(x) = \dfrac{P(x)}{Q(x)}$, where $P$ and $Q$ are monic polynomials with integer
coefficients. The algorithm of polynomial division shows that $\Phi_n$ is a polynomial
with rational coefficients. Therefore there exists an integer $m$ such that $m\Phi_n$
has integer coefficients and the greatest common divisor of these coefficients is
equal to 1. By Gauss's lemma the greatest common divisor of the coefficients
of $mP = (m\Phi_n)Q$ is equal to the product of the greatest common divisors of
the coefficients of $m\Phi_n$ and $Q$, i.e., is equal to 1.

On the other hand, the greatest common divisor of the coefficients of $mP$
is equal to $m$. Therefore $m = \pm 1$, i.e., the coefficients of $\Phi_n$ are integers.


3.3.3 Irreducibility of cyclotomic polynomials 


In the preceding subsection we have shown that the coefficients of $\Phi_n(x)$ are
integers.

Theorem 3.3.4. The polynomial $\Phi_n$ is irreducible over $\mathbb{Z}$.

Proof. Suppose that $\Phi_n = fg$, where $f$ and $g$ are polynomials with integer
coefficients. Let $\varepsilon$ be a root of $\Phi_n$. We may assume that $f(\varepsilon) = 0$ and $f$ is
irreducible. Let $p$ be a prime relatively prime to $n$. Then $\varepsilon^p$ is a root of $\Phi_n$.
We want to prove that $\varepsilon^p$ is a root of $f$. Suppose that $\varepsilon^p$ is not a root of $f$.
Then we may assume that $\Phi_n = fgh$, where $f$ and $g$ are irreducible monic
polynomials, $f(\varepsilon) = 0$ and $g(\varepsilon^p) = 0$.

The polynomial $x^n - 1$ and the irreducible polynomial $f(x)$ have a common
root $\varepsilon$, and hence $x^n - 1$ is divisible by $f(x)$. Similarly, $x^n - 1$ is divisible by
$g(x)$. Since $f$ and $g$ are relatively prime, it follows that $x^n - 1$ is divisible
by their product. Therefore the discriminant $D$ of $x^n - 1$ is divisible by the
resultant $R(f, g)$, cf. Theorem 1.3.4 on page 24.

It is easy to verify that $D = \pm n^n$ (see Example 1.3.7 on page 25). To get
a contradiction, it suffices to show that $R(f, g)$ is divisible by $p$. We require
the following lemma.

Lemma. If $p$ is a prime and $f(x)$ is a polynomial with integer coefficients,
then $(f(x))^p \equiv f(x^p) \pmod p$.

Proof. Let $f(x) = a_nx^n + \dots + a_1x + a_0$. Then

$$(f(x))^p = \sum_{k_0+\dots+k_n=p} \frac{p!}{k_0! \cdots k_n!}\,(a_nx^n)^{k_n} \cdots (a_0)^{k_0}.$$

The number $\dfrac{p!}{k_0! \cdots k_n!}$ is not divisible by $p$ only if one of the numbers $k_0, \dots, k_n$
is equal to $p$. Hence

$$(f(x))^p \equiv (a_nx^n)^p + \dots + (a_0)^p \pmod p.$$

Since $a^p \equiv a \pmod p$ for all $a$, we get the statement desired.
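A quick numerical check of the Lemma for a sample polynomial may be helpful. The sketch below (names illustrative) raises a coefficient list to the $p$-th power modulo $p$ by repeated multiplication and compares the result with $f(x^p)$ reduced modulo $p$.

```python
def poly_mul_mod(a, b, p):
    """Product of two coefficient lists (lowest degree first), reduced mod p."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] = (result[i + j] + ai * bj) % p
    return result


def poly_pow_mod(f, exponent, p):
    """f(x)^exponent with coefficients reduced mod p."""
    result = [1]
    for _ in range(exponent):
        result = poly_mul_mod(result, f, p)
    return result


def substitute_x_to_p(f, p):
    """Coefficients of f(x^p), reduced mod p."""
    result = [0] * ((len(f) - 1) * p + 1)
    for k, c in enumerate(f):
        result[k * p] = c % p
    return result


if __name__ == "__main__":
    f = [2, -1, 3]                # 2 - x + 3x^2
    print(poly_pow_mod(f, 5, 5))
    print(substitute_x_to_p(f, 5))
```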


Returning to the main proof, we let $y_1 = \varepsilon^p, y_2, \dots, y_k$ be the roots of
$g$. By the Lemma, $f(\varepsilon^p) \equiv (f(\varepsilon))^p = 0 \pmod p$, i.e., $f(y_1) = p\psi(y_1)$, where
$\psi$ is a polynomial with integer coefficients. The polynomial $f - p\psi$ and the
irreducible polynomial $g$ have a common root $y_1$. Hence $f - p\psi$ is divisible by
$g$, and therefore $f(y_i) - p\psi(y_i) = 0$ for all $i$. Therefore

$$R(f, g) = \pm f(y_1) \cdots f(y_k) = \pm p^k\,\psi(y_1) \cdots \psi(y_k).$$

The expression $\psi(y_1) \cdots \psi(y_k)$ is a symmetric polynomial with integer
coefficients in the roots of $g$. Therefore this expression is an integer, i.e., $R(f, g)$
is divisible by $p^k$.

Thus, if $\Phi_n$ is divisible by the irreducible polynomial $f$ and $\varepsilon$ is a root
of $f$, then, for any prime $p$ relatively prime to $n$, the number $\varepsilon^p$ is also a
root of $f$. Now it is easy to demonstrate that all the roots of $\Phi_n$ are also the
roots of $f$, i.e., $f = \pm\Phi_n$. Indeed, any root $\omega$ of $\Phi_n$ is of the form $\varepsilon^m$, where
$(m, n) = 1$. Let us represent $m$ in the form $m = p_1 \cdots p_s$, where $p_1, \dots, p_s$
are primes among which some may coincide. The condition $(m, n) = 1$ implies
that $(p_i, n) = 1$ for all $i$. Therefore $\varepsilon^{p_1}, \varepsilon^{p_1p_2}, \dots, \varepsilon^{p_1 \cdots p_s} = \omega$ are the roots of $f$.


3.3.4 The expression for $\Phi_{mn}$ in terms of $\Phi_n$


In many cases the cyclotomic polynomial $\Phi_{mn}(x)$ can be expressed in terms
of $\Phi_n(x)$. We confine ourselves to the case when $m = p$ is a prime.


Theorem 3.3.5. Let $p$ be a prime. Then

$$\Phi_{pn}(x) = \begin{cases} \Phi_n(x^p) & \text{if } (n, p) = p; \\[4pt] \dfrac{\Phi_n(x^p)}{\Phi_n(x)} & \text{if } (n, p) = 1. \end{cases}$$


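Proof. First, consider the case where $n$ is divisible by $p$. If $\varepsilon$ is a primitive
root of degree $pn$, then $\omega = \varepsilon^p$ is a primitive root of degree $n$. To the root
$\omega$, the roots $\varepsilon_1, \dots, \varepsilon_p$ correspond so that $(x - \varepsilon_1) \cdots (x - \varepsilon_p) = x^p - \omega$.
Therefore

$$\Phi_{pn}(x) = \prod_{\varepsilon} (x - \varepsilon) = \prod_{\omega} (x^p - \omega) = \Phi_n(x^p).$$

Now consider the case where $n$ is not divisible by $p$. In this case the divisors
of $pn$ consist of the divisors of $n$ and their products by $p$. Hence

$$\Phi_{pn}(x) = \prod_{d|pn} (x^d - 1)^{\mu(pn/d)} = \prod_{d|n} (x^d - 1)^{\mu(pn/d)} \prod_{d|n} (x^{pd} - 1)^{\mu(n/d)}.$$

Since $\mu\left(\dfrac{pn}{d}\right) = -\mu\left(\dfrac{n}{d}\right)$, it follows that

$$\Phi_{pn}(x) = \prod_{d|n} \frac{(x^{pd} - 1)^{\mu(n/d)}}{(x^d - 1)^{\mu(n/d)}} = \frac{\Phi_n(x^p)}{\Phi_n(x)}.$$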


Using Theorem 3.3.5 we can compute $\Phi_n(\pm 1)$. Let us start with $\Phi_n(1)$. If
$n$ is divisible by $p^2$, then by Theorem 3.3.5 $\Phi_n(1) = \Phi_{n/p}(1)$. Therefore, if
$n = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$ and $m = p_1 \cdots p_k$, then $\Phi_n(1) = \Phi_m(1)$. It remains to compute
$\Phi_m(1)$. If $m = p$ is a prime, then $\Phi_p(1) = p$. If $m = p_1 \cdots p_k$, where $k > 1$,
we set $p = p_1$ and $n = \dfrac{m}{p}$. By Theorem 3.3.5 we have $\Phi_m(1) = \dfrac{\Phi_n(1)}{\Phi_n(1)} = 1$.
Thus, if $n > 1$,

$$\Phi_n(1) = \begin{cases} p & \text{if } n = p^\alpha; \\ 1 & \text{if } n \ne p^\alpha. \end{cases}$$

Now let us compute $\Phi_n(-1)$. The following cases are possible:

1) $n > 1$ is odd. Then $\Phi_n(-1) = \Phi_{2n}(1) = 1$.

2) $n = 2^k$. Then $\Phi_n(x) = \dfrac{x^n - 1}{x^{n/2} - 1} = x^{n/2} + 1$. Hence $\Phi_n(-1) = 0$ for
$n = 2$ and $\Phi_n(-1) = 2$ for $n = 2^k$, where $k > 1$.

3) $n = 2m$, where $m > 1$ is odd. In this case $\Phi_n(-1) = \Phi_m(1)$. Therefore
$\Phi_n(-1) = p$ if $m = p^\alpha$ and $\Phi_n(-1) = 1$ if $m$ has more than one prime divisor.

4) $n = 2^km$, where $k > 1$ and $m > 1$ is odd. Let $m = p_1^{\alpha_1} \cdots p_l^{\alpha_l}$. Then
$\Phi_n(x) = \Phi_{2r}(x^s)$, where $r = p_1 \cdots p_l$ and $s = 2^{k-1}p_1^{\alpha_1-1} \cdots p_l^{\alpha_l-1}$. Since $s$ is even,
$\Phi_n(-1) = \Phi_{2r}(1) = 1$.


3.3.5 The discriminant of a cyclotomic polynomial 
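Let us represent the cyclotomic polynomial $\Phi_n(x)$ in the form

$$\Phi_n(x) = \prod_{d|n} (x^d - 1)^{\mu(n/d)} = (x^n - 1) \prod_{d|n,\, d \ne n} (x^d - 1)^{\mu(n/d)}.$$

If $\varepsilon$ is a root of $\Phi_n$, then

$$\Phi_n'(\varepsilon) = n\varepsilon^{n-1} \prod_{d|n,\, d \ne n} (\varepsilon^d - 1)^{\mu(n/d)}.$$

Therefore the absolute value of the discriminant of $\Phi_n$ is

$$\prod_{\varepsilon} |\Phi_n'(\varepsilon)| = n^{\varphi(n)} \prod_{d|n,\, d \ne n} \prod_{\varepsilon} |\varepsilon^d - 1|^{\mu(n/d)}.$$

Clearly, $\varepsilon^d$ is a primitive root of degree $\dfrac{n}{d}$, i.e.,

$$\prod_{\varepsilon} (1 - \varepsilon^d) = \left(\Phi_{n/d}(1)\right)^{\varphi(n)/\varphi(n/d)},$$

since $\dfrac{\deg \Phi_n}{\deg \Phi_{n/d}} = \dfrac{\varphi(n)}{\varphi(n/d)}$.

The value of $\Phi_{n/d}(x)$ at $x = 1$ can be distinct from 1 only if $\dfrac{n}{d} = p^\alpha$. On
the other hand, $\mu\left(\dfrac{n}{d}\right) \ne 0$ only if $\dfrac{n}{d}$ is not divisible by the square of a prime.
Hence there remain only the values of $d$ for which $\dfrac{n}{d}$ is a prime. Thus,

$$\prod_{\varepsilon} |\Phi_n'(\varepsilon)| = \frac{n^{\varphi(n)}}{\prod_{p|n} p^{\varphi(n)/(p-1)}}.$$

It remains to determine the sign of the discriminant of $\Phi_n$. To this end,
we use the fact that $\Phi_n$ has no real roots and its degree is equal to $\varphi(n)$. By
Theorem 1.3.5 on page 24 the sign of the discriminant should then be equal to
$(-1)^{\varphi(n)/2}$.

3.3.6 The resultant of a pair of cyclotomic polynomials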


We begin by calculating the resultant $R(\Phi_n, x^m - 1)$. The polynomial $x^m - 1$
is divisible by $\Phi_1 = x - 1$, and hence $R(\Phi_1, x^m - 1) = 0$.
Let $n > 2$. Let $d = (n, m)$ and $n_1 = \dfrac{n}{d}$. Let $\varepsilon_1, \varepsilon_2, \dots$ be the $n$-th primitive
roots of unity, and let $\eta_1, \eta_2, \dots$ be the $n_1$-th primitive roots of unity. Then

$$R(\Phi_n, x^m - 1) = \prod_i (\varepsilon_i^m - 1) = \prod_i (1 - \varepsilon_i^m) = \Big(\prod_j (1 - \eta_j)\Big)^{\varphi(n)/\varphi(n_1)} = \left(\Phi_{n_1}(1)\right)^{\varphi(n)/\varphi(n_1)}.$$

If $n_1 = 1$, i.e., if $m$ is divisible by $n$, then $\Phi_{n_1}(1) = 0$, and hence
$R(\Phi_n, x^m - 1) = 0$.

If $n_1 \ne 1$, then $\Phi_{n_1}(1) = p$ for $n_1 = p^\alpha$ and $\Phi_{n_1}(1) = 1$ for $n_1 \ne p^\alpha$.
Passing to the calculation of $R(\Phi_n, \Phi_m)$, we observe that it is an integer
that divides both $R(\Phi_n, x^m - 1)$ and $R(\Phi_m, x^n - 1)$. Indeed,

$$x^m - 1 = \Phi_m(x)f(x),$$

where $f(x)$ is a polynomial with integer coefficients, and so

$$R(\Phi_n, x^m - 1) = R(\Phi_n, \Phi_mf) = R(\Phi_n, \Phi_m)\,R(\Phi_n, f).$$

Further, if $n > m > 1$, then

$$R(\Phi_m, \Phi_n) = (-1)^{\varphi(m)\varphi(n)}R(\Phi_n, \Phi_m) = R(\Phi_n, \Phi_m)$$

and $R(\Phi_n, \Phi_m) > 0$. The latter inequality can be proved, for example, as
follows. Clearly,

$$0 \ne R(\Phi_n, x^m - 1) = \prod_{d|m} R(\Phi_n, \Phi_d),$$

and hence

$$R(\Phi_n, \Phi_m) = \prod_{d|m} R(\Phi_n, x^d - 1)^{\mu(m/d)} > 0.$$

If $m$ is not divisible by $n$ and $n$ is not divisible by $m$, then the numbers
$\dfrac{m}{d}$ and $\dfrac{n}{d}$, where $d = (m, n)$, are distinct from 1 and are relatively prime.
Therefore $R(\Phi_n, x^m - 1)$ and $R(\Phi_m, x^n - 1)$ are relatively prime, and therefore
$R(\Phi_n, \Phi_m) = 1$.

To be definite, let $m$ be divisible by $n$. If $m = n$, then $R(\Phi_n, \Phi_m) = 0$. If
$\dfrac{m}{n} \ne p^\alpha$, then $R(\Phi_m, x^n - 1) = 1$, and hence $R(\Phi_n, \Phi_m) = 1$. It remains to
consider the case $\dfrac{m}{n} = p^\alpha$. Clearly,

$$R(\Phi_n, \Phi_m) = \prod_{\delta|n} R(\Phi_m, x^\delta - 1)^{\mu(n/\delta)}.$$

On the right-hand side all the factors are equal to 1 except those for which
$\dfrac{m}{\delta}$ is a power of $p$.
If $(n, p) = 1$, the factor distinct from 1 only appears for $\delta = n$. In this case

$$R(\Phi_n, \Phi_m) = R(\Phi_m, x^n - 1) = p^{\varphi(m)/\varphi(m/n)} = p^{\varphi(n)}.$$

If $n$ is divisible by $p$, the factors distinct from 1 only appear for $\delta = n$ and
$\delta = \dfrac{n}{p}$. In this case

$$R(\Phi_n, \Phi_m) = \frac{R(\Phi_m, x^n - 1)}{R(\Phi_m, x^{n/p} - 1)} = p^{\frac{\varphi(m)}{\varphi(m/n)} - \frac{\varphi(m)}{\varphi(mp/n)}} = p^{\varphi(n)}.$$

Thus, if $m \ge n$,

$$R(\Phi_n, \Phi_m) = \begin{cases} 0 & \text{if } m = n; \\ p^{\varphi(n)} & \text{if } m = np^\alpha; \\ 1 & \text{otherwise.} \end{cases}$$


3.3.7 Coefficients of the cyclotomic polynomials 


The examples of polynomials $\Phi_n(x)$ for small values of $n$ show that their
coefficients are 0 and $\pm 1$. But this is not always the case. In [Su3] it is proved
that any integer may serve as a coefficient of a cyclotomic polynomial. This
proof is based on the following auxiliary statement.

Lemma. For any positive integer $t \ge 3$, there exist primes $p_1 < p_2 < \dots < p_t$
such that $p_1 + p_2 > p_t$.

Proof. Fix $t \ge 3$. Suppose on the contrary that, for any set of primes
$p_1 < p_2 < \dots < p_t$, the inequality $p_1 + p_2 \le p_t$ holds. In this case $2p_1 < p_t$,
and so between $2^{k-1}$ and $2^k$ there are less than $t$ primes. This means that
$\pi(2^k) < kt$, where $\pi(s)$ is the number of primes between 1 and $s$.
By Chebyshev's theorem (see [GNS], [Da2] or [Ch1]) $\pi(x) > \dfrac{cx}{\ln x}$, where $c$
is a positive constant. Hence $\dfrac{c2^k}{k\ln 2} < kt$, i.e., $c2^k < k^2t\ln 2$. For a sufficiently
large $k$, this inequality will be violated.


Let $t \ge 3$ be an odd integer. Select primes $p_1 < p_2 < \dots < p_t$ so that
$p_1 + p_2 > p_t$. Set $p = p_t$ and consider the polynomial $\Phi_{p_1 \cdots p_t}(x)$ modulo $x^{p+1}$. For $t$
odd, we have

$$\Phi_{p_1 \cdots p_t}(x) = \frac{\prod (x^{p_i} - 1) \prod (x^{p_ip_jp_k} - 1) \cdots}{(x - 1) \prod (x^{p_ip_j} - 1) \cdots}.$$

But $x^{p_ip_j} \equiv 0 \pmod{x^{p+1}}$, $x^{p_ip_jp_k} \equiv 0 \pmod{x^{p+1}}$, and so on. Hence

$$\Phi_{p_1 \cdots p_t}(x) \equiv \pm\frac{(1 - x^{p_1}) \cdots (1 - x^{p_t})}{1 - x} \pmod{x^{p+1}}.$$

We can select the plus sign here since $\Phi_{p_1 \cdots p_t}(0) = 1$. The inequalities

$$p_i + p_j \ge p_1 + p_2 > p_t = p$$

imply that

$$(1 - x^{p_1}) \cdots (1 - x^{p_t}) \equiv 1 - x^{p_1} - \dots - x^{p_t} \pmod{x^{p+1}}.$$

It is also clear that $(1 - x)^{-1} \equiv 1 + x + \dots + x^p \pmod{x^{p+1}}$. Hence

$$\Phi_{p_1 \cdots p_t}(x) \equiv (1 + x + \dots + x^p)(1 - x^{p_1} - \dots - x^{p_t}) \pmod{x^{p+1}}.$$

Among the monomials $x^{p_i}, x^{p_i+1}, \dots, x^{p_i+p}$, the monomial $x^p = x^{p_t}$ occurs
for all $i$ whereas the monomials $x^{p-1}$ and $x^{p-2}$ occur for all $i \ne t$. Therefore
the coefficient of $x^p$ in $\Phi_{p_1 \cdots p_t}$ is $-t + 1$ and the coefficient of $x^{p-2}$ is
$-(t - 1) + 1 = -t + 2$.

As $t$ runs over all the odd numbers starting with 3, the numbers $-t + 1$
and $-t + 2$ run over all the negative integers.

To see that all positive integers can be coefficients of cyclotomic polynomials,
consider the polynomial $\Phi_{2p_1 \cdots p_t}$, where $p_1 \ge 3$ and $p_1 + p_2 > p_t$. The
number $n = p_1 \cdots p_t$ is odd, and hence $\Phi_{2n}(x) = \Phi_n(-x)$. This means that
the coefficients of $x^p$, as well as those of $x^{p-2}$, in $\Phi_n$ and $\Phi_{2n}$ differ by a sign,
i.e., the coefficients of $x^p$ and $x^{p-2}$ in $\Phi_{2p_1 \cdots p_t}$ are $t - 1$ and $t - 2$, respectively.
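The argument above works modulo $x^{p+1}$, and the same truncation makes a numerical check cheap even when $n = p_1 \cdots p_t$ is large. The sketch below is illustrative only: the function name and the choice of primes are not from the text, and it relies on the assumption (valid for $n > 1$, where the exponents $\mu(n/d)$ sum to zero) that each factor $x^d - 1$ may be replaced by $1 - x^d$.

```python
from itertools import combinations
from math import prod


def cyclotomic_truncated(primes, degree):
    """Coefficients of Phi_n modulo x^(degree+1), lowest degree first,
    where n is the product of the given distinct primes."""
    t = len(primes)
    coeffs = [1] + [0] * degree
    for size in range(t + 1):
        mu = (-1) ** (t - size)   # mu(n/d) when d has `size` prime factors
        for subset in combinations(primes, size):
            d = prod(subset)
            if d > degree:
                continue          # (1 - x^d)^(+-1) is 1 modulo x^(degree+1)
            if mu == 1:           # multiply by (1 - x^d)
                for i in range(degree, d - 1, -1):
                    coeffs[i] -= coeffs[i - d]
            else:                 # divide by (1 - x^d)
                for i in range(d, degree + 1):
                    coeffs[i] += coeffs[i - d]
    return coeffs


if __name__ == "__main__":
    # t = 3 with 3 + 5 > 7: the coefficient of x^7 in Phi_105 is -t + 1 = -2.
    print(cyclotomic_truncated([3, 5, 7], 7))
    # t = 5 with 11 + 13 > 23: the coefficients of x^23 and x^21.
    c = cyclotomic_truncated([11, 13, 17, 19, 23], 23)
    print(c[23], c[21])
```

Including the prime 2 in the list gives the low-order coefficients of $\Phi_{2n}$, where the signs of the odd-degree coefficients are reversed.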


3.3.8 Wedderburn’s theorem 


One of the most interesting applications of cyclotomic polynomials is the 
proof of Wedderburn’s theorem on the commutativity of finite skew fields. 
This proof is due to Witt [Wi2]. 

A skew field is a ring in which the equations $ax = b$ and $xa = b$ are
uniquely solvable for all $a \ne 0$.


Theorem 3.3.6 (Wedderburn). Any finite associative skew field R is com- 
mutative, i.e., it is a field. 


Proof. Let $e_1$ and $e_2$ be solutions of the equations $ax = a$ and $xa = a$
respectively. Then $ae_1a = a^2 = ae_2a$, and hence $ae_1 = ae_2$ and $e_1 = e_2 = e$.
Let us show that $be = b$ for any $b$. Indeed, let $xa = b$. Then $be = xae = xa = b$.
Similarly, $eb = b$.

Therefore the skew field $R$ contains the identity 1. Consider the field $\mathbb{F}_p$
generated by $1 \in R$. The skew field $R$ is a linear space over $\mathbb{F}_p$. Let $r$ be the
dimension of this space. Then $R$ consists of $p^r$ elements. Let $Z$ be the center
of $R$, i.e., the set of elements of $R$ that commute with all the elements of $R$.
Clearly, $Z$ is a field containing $\mathbb{F}_p$. Therefore $Z$ consists of $q = p^s$ elements.
The skew field $R$ is also a linear space over $Z$. If the dimension of $R$ over $Z$ is
equal to $t$, then $R$ consists of $q^t$ elements. Therefore $p^r = q^t = p^{st}$. We wish
to show that $R = Z$, i.e., $t = 1$.

For any element $x \in R$, consider its normalizer $N_x = \{y \in R \mid xy = yx\}$.
Clearly, $N_x$ is a skew subfield of $R$ containing $Z$. On the one hand, the skew
field $N_x$ is a linear space over $Z$, and hence consists of $q^d$ elements. On the
other hand, $R$ is a linear space (module) over $N_x$, and hence $q^t = (q^d)^k = q^{dk}$,
i.e., $d \mid t$.

In the multiplicative group $R^*$, we consider for every element $x$ the orbit

$$O_x = \{yxy^{-1} \mid y \in R^*\}.$$

Clearly, $O_x$ consists of

$$\frac{|R^*|}{|N_x^*|} = \frac{q^t - 1}{q^d - 1}$$

elements. The orbits of distinct elements either coincide or do not intersect.
Hence, $R^*$ splits into the disjoint union of orbits and the orbit of every element
from $Z^*$ consists of a single element. Therefore

$$q^t - 1 = (q - 1) + \sum_d \frac{q^t - 1}{q^d - 1}, \tag{1}$$

where the sum runs over the divisors $d$ of $t$ such that $d < t$ (the equality $d = t$
corresponds to the case $x \in Z^*$; such elements are separated and correspond
to the summand $q - 1$).
With the help of the cyclotomic polynomial $\Phi_t(x)$ we will show that (1)
is only possible for $t = 1$. Indeed, the polynomial $x^t - 1$ is divisible by $\Phi_t(x)$.
Moreover, if $d \mid t$ and $d < t$, the polynomial $\dfrac{x^t - 1}{x^d - 1}$ is also divisible by $\Phi_t(x)$
since in this case the polynomials $x^d - 1$ and $\Phi_t(x)$ have no common roots. In
particular, the numbers $q^t - 1$ and $\dfrac{q^t - 1}{q^d - 1}$ are divisible by $\Phi_t(q)$. Relation (1)
then shows that $q - 1$ is divisible by $\Phi_t(q)$. On the other hand, for $t > 1$,

$$|\Phi_t(q)| = \prod_i |q - \varepsilon_i| > q - 1$$

since $|\varepsilon_i| = 1$ and $\varepsilon_i \ne 1$. This contradiction shows that $t = 1$.

3.3.9 Polynomials irreducible modulo p


With the help of Möbius's inversion formula we can obtain an expression for
the number of irreducible monic polynomials of degree $n$ over $\mathbb{F}_p$. Let us prove
first the following statement.

Theorem 3.3.7. Let $F_d(x)$ be the product of all irreducible monic polynomials
of degree $d$ over $\mathbb{F}_p$. Then

$$x^{p^n} - x = \prod_{d|n} F_d(x).$$

Proof. The polynomial $x^{p^n} - x$ is relatively prime to its derivative
$p^nx^{p^n-1} - 1 = -1$, and hence it has no multiple roots. Therefore it suffices to
prove that if $f(x)$ is an irreducible monic polynomial of degree $d$, then $f(x)$
divides $x^{p^n} - x$ if and only if $d$ divides $n$.

Let $\alpha$ be a root of $f$ and $K = \mathbb{F}_p(\alpha)$ an extension of degree $d$ of $\mathbb{F}_p$. This
extension consists of $p^d$ elements and all its elements satisfy the equation
$x^{p^d} - x = 0$. Indeed, the multiplicative group of $K$ is of order $p^d - 1$, and so
any nonzero element $x \in K$ satisfies $x^{p^d-1} = 1$.


Lemma. a) Over an arbitrary field, the polynomial $x^n - 1$ divides $x^m - 1$
if and only if $n$ divides $m$.

b) If $a \ge 2$ is a positive integer, then $a^n - 1$ divides $a^m - 1$ if and only if
$n$ divides $m$.

Proof. a) Let $m = qn + r$, where $0 \le r < n$. Then

$$\frac{x^m - 1}{x^n - 1} = x^r\,\frac{x^{qn} - 1}{x^n - 1} + \frac{x^r - 1}{x^n - 1}.$$

The polynomial $x^{qn} - 1$ is divisible by $x^n - 1$. Hence $x^m - 1$ is divisible by
$x^n - 1$ if and only if $x^r - 1$ is divisible by $x^n - 1$. But $r < n$ and so $x^r - 1$ is
divisible by $x^n - 1$ only for $r = 0$.

b) is similarly proved.


Let us continue with the proof of the theorem. First, suppose that $d$ divides
$n$. Then $p^d - 1$ divides $p^n - 1$, and hence $x^{p^d} - x$ divides $x^{p^n} - x$. A root $\alpha$
of the irreducible polynomial $f(x)$ is also a root of the equation $x^{p^d} = x$, and
hence $f(x)$ divides $x^{p^d} - x$ and therefore also $x^{p^n} - x$.

Suppose now that $f(x)$ divides $x^{p^n} - x$. Then $\alpha^{p^n} - \alpha = 0$. For an arbitrary
$b_1\alpha^{d-1} + b_2\alpha^{d-2} + \dots + b_d \in K$, we have

$$(b_1\alpha^{d-1} + \dots + b_d)^{p^n} = b_1(\alpha^{p^n})^{d-1} + \dots + b_d = b_1\alpha^{d-1} + \dots + b_d.$$

Thus any element of $K$ satisfies the equation $x^{p^n} = x$, i.e., $x^{p^n} - x$ is divisible
by $x^{p^d} - x$. Hence $n$ is divisible by $d$.

Let $N_d$ be the number of irreducible monic polynomials of degree $d$ over
$\mathbb{F}_p$. Then the degree of $F_d$ is $dN_d$, and hence

$$p^n = \sum_{d|n} dN_d.$$

Applying the Möbius inversion formula we obtain a formula for $N_n$:

$$N_n = \frac{1}{n} \sum_{d|n} \mu\left(\frac{n}{d}\right)p^d.$$

In particular, $N_n \ne 0$ since the sum $\sum_{d|n} \mu\left(\frac{n}{d}\right)p^d$ is of the form

$$p^{d_1} \pm p^{d_2} \pm \dots \pm p^{d_k},$$

where the numbers $d_i$ are distinct.
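For concreteness, the counting formula can be evaluated directly. The sketch below (function names illustrative) also re-checks the relation $p^n = \sum_{d|n} dN_d$ from which the formula was derived.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def count_irreducible(n, p):
    """N_n: the number of irreducible monic polynomials of degree n over F_p."""
    total = sum(mobius(n // d) * p ** d for d in range(1, n + 1) if n % d == 0)
    return total // n             # the sum is divisible by n


if __name__ == "__main__":
    # Over F_2: x and x + 1; x^2 + x + 1; two cubics; three quartics.
    print([count_irreducible(n, 2) for n in range(1, 7)])
```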


3.4 Chebyshev polynomials 


3.4.1 Definition and main properties of Chebyshev polynomials 


Chebyshev polynomials $T_n(x)$ constitute one of the most remarkable families
of polynomials. They often appear in various branches of mathematics,
from approximation theory to number theory to topology of three-dimensional
manifolds. We will discuss several simpler but rather important properties of
Chebyshev polynomials.

We use the definition of Chebyshev polynomials based on the fact that
$\cos n\varphi$ is polynomially expressed in terms of $\cos\varphi$, i.e., there exists a polynomial
$T_n(x)$ such that $T_n(x) = \cos n\varphi$ for $x = \cos\varphi$. Indeed, the formula

$$\cos(n+1)\varphi + \cos(n-1)\varphi = 2\cos\varphi\cos n\varphi$$

shows that the polynomials $T_n(x)$ recursively defined by the relation

$$T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x)$$

with the initial values $T_0(x) = 1$ and $T_1(x) = x$, possess the property required.
These polynomials $T_n(x)$ are called the Chebyshev polynomials.
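The recurrence translates directly into a computation on coefficient lists. The following is a minimal sketch (function names are illustrative) that builds $T_n$ and checks the defining property $T_n(\cos\varphi) = \cos n\varphi$ numerically.

```python
import math


def chebyshev(n):
    """Coefficients of T_n, lowest degree first, from T_{k+1} = 2x T_k - T_{k-1}."""
    previous, current = [1], [0, 1]          # T_0 = 1, T_1 = x
    if n == 0:
        return previous
    for _ in range(n - 1):
        shifted = [0] + [2 * c for c in current]                 # 2x * T_k
        padded = previous + [0] * (len(shifted) - len(previous))
        previous, current = current, [s - q for s, q in zip(shifted, padded)]
    return current


def evaluate(coeffs, x):
    """Horner evaluation of a coefficient list (lowest degree first)."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


if __name__ == "__main__":
    print(chebyshev(3))                      # 4x^3 - 3x
    phi = 0.3
    print(evaluate(chebyshev(7), math.cos(phi)), math.cos(7 * phi))
```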

The fact that $T_n(x) = \cos n\varphi$ for $x = \cos\varphi$ directly implies that $|T_n(x)| \le 1$
for $|x| \le 1$. The above recurrence implies that, for $n \ge 1$,

$$T_n(x) = 2^{n-1}x^n + a_1x^{n-1} + \dots + a_n, \tag{*}$$

where $a_1, \dots, a_n$ are integers.
The most important property of Chebyshev polynomials was discovered
by Chebyshev himself. It consists of the following.

Theorem 3.4.1. Let $P_n(x) = x^n + \cdots$ be a monic polynomial of degree $n$
such that $|P_n(x)| \le \dfrac{1}{2^{n-1}}$ for $|x| \le 1$. Then $P_n(x) = \dfrac{T_n(x)}{2^{n-1}}$. In other words,
the polynomial $\dfrac{T_n(x)}{2^{n-1}}$ is the monic polynomial of degree $n$ that has the least
deviation from zero on the segment $[-1, 1]$.

Proof. We use only one property of the polynomial (*), namely, the fact
that

$$T_n\left(\cos\frac{k\pi}{n}\right) = \cos k\pi = (-1)^k \quad \text{for } k = 0, 1, \dots, n.$$

Consider the polynomial

$$Q(x) = \frac{1}{2^{n-1}}T_n(x) - P_n(x).$$

Its degree does not exceed $n - 1$ since the leading terms of $\dfrac{1}{2^{n-1}}T_n(x)$ and
$P_n(x)$ are equal. Since $|P_n(x)| \le \dfrac{1}{2^{n-1}}$ for $|x| \le 1$, it follows that at the point
$x_k = \cos\dfrac{k\pi}{n}$ the sign of $Q(x_k)$ coincides with the sign of $T_n(x_k)$. Therefore,
at the end points of each segment $[x_{k+1}, x_k]$, the polynomial $Q(x)$ takes values
of opposite signs, and hence $Q(x)$ has a root on each of these segments.


[FIGURE 3.1: graphs of $Q(x)$ near the points $x_{k+1}$, $x_k$, $x_{k-1}$.]


If $Q(x_k) = 0$ we need slightly more accurate arguments. In this case either
$x_k$ is a double root or within one of the segments $[x_{k+1}, x_k]$ and $[x_k, x_{k-1}]$ there
is one more root. This follows from the fact that at $x_{k+1}$ and $x_{k-1}$ the values
of $Q(x)$ have the same sign (Fig. 3.1).

The number of segments $[x_{k+1}, x_k]$ is equal to $n$, and hence the polynomial
$Q(x)$ has at least $n$ roots. For a polynomial of degree not greater than $n - 1$,
this means that it is identically zero, i.e., $P_n(x) = \dfrac{1}{2^{n-1}}T_n(x)$.


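If $z = \cos\varphi + i\sin\varphi$, then $z + z^{-1} = 2\cos\varphi$ and $z^n + z^{-n} = 2\cos n\varphi$.
Therefore

$$T_n\left(\frac{z + z^{-1}}{2}\right) = \frac{z^n + z^{-n}}{2}.$$

Using this property we can prove the following statement.

Theorem 3.4.2. Let $m = \left[\dfrac{n}{2}\right]$. Then

$$T_n(x) = \sum_{j=0}^{m} \binom{n}{2j}x^{n-2j}(x^2 - 1)^j.$$

Proof. Let $x = \dfrac{z + z^{-1}}{2}$ and $y = \dfrac{z - z^{-1}}{2}$. Then $z = x + y$, $z^{-1} = x - y$ and
$y^2 = x^2 - 1$, so that

$$z^n + z^{-n} = (x + y)^n + (x - y)^n = \sum_{i=0}^{n} \binom{n}{i}x^{n-i}y^i + \sum_{i=0}^{n} (-1)^i\binom{n}{i}x^{n-i}y^i =$$
$$= 2\sum_{j=0}^{m} \binom{n}{2j}x^{n-2j}y^{2j} = 2\sum_{j=0}^{m} \binom{n}{2j}x^{n-2j}(x^2 - 1)^j.$$

It remains to observe that $T_n(x) = \frac{1}{2}(z^n + z^{-n})$.

Corollary. Let $p$ be an odd prime. Then

$$T_p(x) \equiv T_1(x) \pmod p.$$

Proof. We write $p = 2m + 1$. Then

$$T_p(x) = \sum_{j=0}^{m} \binom{p}{2j}x^{p-2j}(x^2 - 1)^j.$$

If $j > 0$, then $\binom{p}{2j}$ is divisible by $p$. Therefore

$$T_p(x) \equiv x^p \equiv x = T_1(x) \pmod p.$$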
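The closed form $T_n(x) = \sum_{j} \binom{n}{2j}x^{n-2j}(x^2-1)^j$ with $0 \le j \le [n/2]$, and the congruence for odd primes, can both be checked by expanding the sum. This is an illustrative sketch (the function name is hypothetical); it verifies the slightly stronger coefficient-wise statement that $T_p(x) - x^p$ has all coefficients divisible by $p$.

```python
from math import comb


def chebyshev_closed_form(n):
    """Coefficients (lowest degree first) of sum_j C(n, 2j) x^(n-2j) (x^2 - 1)^j."""
    coeffs = [0] * (n + 1)
    for j in range(n // 2 + 1):
        # Expand (x^2 - 1)^j = sum_i C(j, i) (-1)^(j-i) x^(2i).
        for i in range(j + 1):
            coeffs[n - 2 * j + 2 * i] += comb(n, 2 * j) * comb(j, i) * (-1) ** (j - i)
    return coeffs


if __name__ == "__main__":
    print(chebyshev_closed_form(3))                      # 4x^3 - 3x
    print([c % 5 for c in chebyshev_closed_form(5)])     # x^5 modulo 5
```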


For any pair of polynomials $P$ and $Q$, define their composition naturally,
by setting

$$P \circ Q(x) = P(Q(x)).$$

The polynomials $P$ and $Q$ are said to commute if $P \circ Q = Q \circ P$, i.e., if
$P(Q(x)) = Q(P(x))$.

Theorem 3.4.3. The polynomials $T_n(x)$ and $T_m(x)$ commute.

Proof. Let $x = \cos\varphi$. Then $T_n(x) = \cos(n\varphi) = y$ and $T_m(y) = \cos m(n\varphi)$,
and hence $T_m(T_n(x)) = \cos mn\varphi$. Similarly, $T_n(T_m(x)) = \cos mn\varphi$. Hence the
identity $T_n(T_m(x)) = T_m(T_n(x))$ holds for $|x| \le 1$, and therefore it holds for
all $x$.


Chebyshev polynomials are the only non-trivial example of commuting
polynomials. Indeed, the following classification theorem for pairs of commuting
polynomials holds. Let $l(x) = ax + b$, where $a, b \in \mathbb{C}$ and $a \ne 0$. We will
say that the pair of polynomials $l \circ f \circ l^{-1}$ and $l \circ g \circ l^{-1}$ is equivalent to the
pair of polynomials $f$ and $g$.

Theorem 3.4.4 (Ritt). Let $f$ and $g$ be commuting polynomials. Then the
pair $(f, g)$ is equivalent to one of the following pairs:

1) $x^m$ and $\varepsilon x^n$, where $\varepsilon^{m-1} = 1$;

2) $\pm T_m(x)$ and $\pm T_n(x)$, where $T_m$ and $T_n$ are Chebyshev polynomials;

3) $\varepsilon_1Q^{(k)}(x)$ and $\varepsilon_2Q^{(l)}(x)$, where $\varepsilon_1^q = \varepsilon_2^q = 1$, $Q(x) = xP(x^q)$, and
where $Q^{(1)} = Q$, $Q^{(2)} = Q \circ Q$, $Q^{(3)} = Q \circ Q \circ Q$, and so on.


This theorem was proved in 1922 by the American mathematician Ritt; 
all the known proofs of it are rather complicated. A modern exposition of the 
proof of Ritt’s theorem is given in [Prd4]. 

Sometimes instead of $T_n(x)$ it is convenient to consider the monic polynomial
$P_n(x) = 2T_n\left(\dfrac{x}{2}\right)$. The polynomials $P_n(x)$ satisfy the recurrence relation

$$P_{n+1}(x) = xP_n(x) - P_{n-1}(x).$$

Hence $P_n(x)$ is a polynomial with integer coefficients.
If $z = \cos\varphi + i\sin\varphi = e^{i\varphi}$, then $z + z^{-1} = 2\cos\varphi$ and $z^n + z^{-n} = 2\cos n\varphi$.
Therefore

$$P_n(z + z^{-1}) = 2T_n(\cos\varphi) = 2\cos n\varphi = z^n + z^{-n},$$

i.e., the polynomial $P_n(x)$ polynomially expresses $z^n + z^{-n}$ via $z + z^{-1}$.
With the help of the polynomials $P_n$ we can prove the next theorem.


Theorem 3.4.5. If both $\alpha$ and $\cos(\alpha\pi)$ are rational, then $2\cos(\alpha\pi)$ is an
integer, i.e., $\cos(\alpha\pi) = 0$, $\pm\frac{1}{2}$ or $\pm 1$.

Proof. Let $\alpha = \dfrac{m}{n}$ be an irreducible fraction. Set $x_0 = 2\cos t$, where
$t = \alpha\pi$. Then

$$P_n(x_0) = 2\cos(nt) = 2\cos(n\alpha\pi) = 2\cos(m\pi) = \pm 2.$$

Hence $x_0$ is a root of the polynomial with integer coefficients

$$P_n(x) \mp 2 = x^n + b_1x^{n-1} + \dots + b_n.$$

Let $x_0 = 2\cos(\alpha\pi) = \dfrac{p}{q}$ be an irreducible fraction. Then

$$p^n + b_1p^{n-1}q + \dots + b_nq^n = 0,$$

and hence $p^n$ is divisible by $q$. But $p$ and $q$ are relatively prime, and so $q = \pm 1$,
i.e., $2\cos(\alpha\pi)$ is an integer.


It is convenient to compute the derivatives of Chebyshev polynomials starting
directly from the relation $T_n(x) = \cos n\varphi$, where $x = \cos\varphi$. For example,

$$T_n'(x) = \frac{d\cos n\varphi / d\varphi}{d\cos\varphi / d\varphi} = \frac{n\sin n\varphi}{\sin\varphi},$$

$$T_n''(x) = \frac{d}{d\varphi}\left(\frac{n\sin n\varphi}{\sin\varphi}\right) \cdot \frac{-1}{\sin\varphi} = \frac{n\cos\varphi\sin n\varphi - n^2\cos n\varphi\sin\varphi}{\sin^3\varphi}.$$

These formulas imply that

$$(1 - x^2)\,T_n'(x) = n\left(T_{n-1}(x) - xT_n(x)\right),$$

$$(1 - x^2)\left(T_n'(x)\right)^2 = n^2\left(1 - T_n(x)^2\right),$$

$$(1 - x^2)\,T_n''(x) - xT_n'(x) + n^2T_n(x) = 0.$$

The identity

$$(1 - x^2)\left(T_n'(x)\right)^2 = n^2\left(1 - T_n(x)^2\right)$$

can be rewritten in the form

$$1 = T_n^2(x) - (x^2 - 1)\,U_n^2(x), \tag{1}$$

where $U_n(x) = \dfrac{\sin n\varphi}{\sin\varphi}$ and $x = \cos\varphi$. It is easy to verify that $U_n$ is a polynomial
with integer coefficients. Indeed, induction on $n$ shows that

$$\sin nx = p_n(\cos x)\sin x, \quad \cos nx = q_n(\cos x),$$

where $p_n$ and $q_n$ are polynomials with integer coefficients.
Identity (1) can be used to solve Pell’s equation 


x? — dy? = 1. 
Indeed, if (x1, yi) is a positive integer solution of this equation, then 
1 = 7360) - 2 (nuy(or)) 
1 
= Tp (#1) — d(yUn(a1))’ ; 
so that (T,(x1), yiUn(a1)) is also a positive integer solution of this equation. 


Remark. One can prove that if (1, y1) is the least positive integer solu- 
tion of Pell’s equation, then all its positive integer solutions are of the form 
(Tn (x1), y1Un(21)). 
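The construction of new Pell solutions from a given one can be sketched as follows. This is an illustration under stated assumptions, not code from the text: the name `pell_solutions` is hypothetical, $U_n$ is normalised as $\sin n\varphi / \sin\varphi$ (so $U_0 = 0$, $U_1 = 1$), and both $T_n$ and $U_n$ are advanced with the recurrence $f_{n+1} = 2xf_n - f_{n-1}$.

```python
def pell_solutions(x1, y1, count):
    """Return (T_n(x1), y1*U_n(x1)) for n = 1..count.

    (x1, y1) is assumed to be a positive integer solution of x^2 - d*y^2 = 1.
    """
    t_prev, t_cur = 1, x1      # T_0(x1), T_1(x1)
    u_prev, u_cur = 0, 1       # U_0(x1), U_1(x1) with U_n = sin(n*phi)/sin(phi)
    solutions = []
    for _ in range(count):
        solutions.append((t_cur, y1 * u_cur))
        t_prev, t_cur = t_cur, 2 * x1 * t_cur - t_prev
        u_prev, u_cur = u_cur, 2 * x1 * u_cur - u_prev
    return solutions

# Example: d = 2 with the solution (3, 2).
for x, y in pell_solutions(3, 2, 6):
    assert x * x - 2 * y * y == 1
```

Integer arithmetic is used throughout, so the check $x^2 - dy^2 = 1$ is exact.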
3.4.2 Orthogonal polynomials


The polynomials $f_k(x)$, $k = 0, 1, \dots$ are said to be orthogonal polynomials on
the interval $[a, b]$ with weight function $w(x) \ge 0$ if $\deg f_k = k$ and

$$\int_a^b f_m(x)f_n(x)w(x)\,dx = 0$$

for $m \ne n$.
In the space $V^{n+1}$ of polynomials of degree $\le n$, we define the inner
product by

$$(f, g) = \int_a^b f(x)g(x)w(x)\,dx.$$

The orthogonal polynomials $f_0, f_1, \dots, f_n$ form an orthogonal basis in the
space $V^{n+1}$ with this inner product.

Given an interval $[a, b]$ and a weight function $w(x)$, the orthogonal polynomials
are uniquely determined up to proportionality. Indeed, they are obtained
by orthogonalization of the basis $1, x, x^2, \dots$

The best-known ones¹ are the following orthogonal polynomials:


| a |b[  w(r) [Name of the polynomial 


Gegenbauer 


Jacobi 
Hermite 
Laguerre 


¹ Bochner showed that, up to a complex linear change of variable, the only polynomial
solutions that arise as the eigenfunctions of the hypergeometric equation
are the Jacobi, Laguerre, Hermite and Bessel polynomials, of which Legendre,
Gegenbauer and other famous polynomials, e.g., Chebyshev ones, are particular
cases, see [BCS].

Theorem 3.4.6. Chebyshev polynomials form an orthogonal system on the
interval $[-1, 1]$ with weight function $w(x) = \frac{1}{\sqrt{1 - x^2}}$.

Proof. Making a change of variable $x = \cos\varphi$, we have

$$\int_{-1}^{1} T_n(x)T_m(x)\frac{dx}{\sqrt{1 - x^2}} = \int_0^{\pi} \cos n\varphi\cos m\varphi\,d\varphi = \int_0^{\pi} \frac{\cos(m+n)\varphi + \cos(m-n)\varphi}{2}\,d\varphi.$$

It remains to observe that

$$\int_0^{\pi} \cos k\varphi\,d\varphi = 0 \quad\text{if } k \ne 0.$$
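The substitution $x = \cos\varphi$ used in the proof also gives a simple numerical check. The sketch below is an assumption-laden illustration: it applies the midpoint rule in $\varphi$ (equivalently, Gauss–Chebyshev quadrature in $x$), which integrates products $T_mT_n$ exactly up to rounding as long as $m + n$ is less than twice the number of nodes. The names `chebyshev_T` and `weighted_inner_product` are illustrative.

```python
import math

def chebyshev_T(n, x):
    """Value of T_n(x) from T_{k+1} = 2x*T_k - T_{k-1}."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 2 * x * cur - prev
    return cur

def weighted_inner_product(m, n, nodes=64):
    """Approximate the integral of T_m*T_n/sqrt(1-x^2) over [-1, 1]."""
    total = 0.0
    for j in range(1, nodes + 1):
        # Midpoints of the phi-grid on [0, pi], mapped to x = cos(phi).
        x = math.cos((2 * j - 1) * math.pi / (2 * nodes))
        total += chebyshev_T(m, x) * chebyshev_T(n, x)
    return math.pi / nodes * total

print(weighted_inner_product(2, 5))   # close to 0
print(weighted_inner_product(3, 3))   # close to pi/2
```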


Corollary. If $P_n(x)$ is a polynomial of degree $n$ and

$$\int_{-1}^{1} P_n(x)\frac{x^k\,dx}{\sqrt{1 - x^2}} = 0$$

for $k = 0, 1, \dots, n-1$, then $P_n(x) = \lambda T_n(x)$, where $\lambda$ is independent of $x$.
Proof. In the space $V^{n+1}$ with inner product

$$(f, g) = \int_{-1}^{1} f(x)g(x)\frac{dx}{\sqrt{1 - x^2}},$$

the orthogonal complement² of the space generated by the polynomials
$1, x, x^2, \dots, x^{n-1}$ is spanned by the Chebyshev polynomial $T_n(x)$.

² Recall that the orthogonal complement of the subspace $V \subset W$ consists of all
vectors in $W$ which are orthogonal to $V$.

The corollary of Theorem 3.4.6 is often convenient in proving that a given
polynomial is indeed a Chebyshev polynomial. For example, using the Corollary
we prove the following statement.

Theorem 3.4.7. Chebyshev polynomials can be defined by the formula

$$T_n(x) = \frac{(-1)^n}{1\cdot 3\cdot 5\cdots(2n-1)}\sqrt{1 - x^2}\,\frac{d^n}{dx^n}(1 - x^2)^{n - 1/2}.$$

Proof. By induction on $m$ we easily prove that for $m \le n$

$$\frac{d^m}{dx^m}(1 - x^2)^{n - 1/2} = P_m(x)(1 - x^2)^{n - m - 1/2},$$

where $P_m(x)$ is a polynomial of degree $m$ such that

$$P_{m+1}(x) = (1 - x^2)P_m'(x) - (2n - 2m - 1)xP_m(x).$$

Hence

$$P_n(x) = \sqrt{1 - x^2}\,\frac{d^n}{dx^n}(1 - x^2)^{n - 1/2}$$

is a polynomial of degree $n$.
Let us verify that $P_n(x) = \lambda T_n(x)$, i.e.,

$$\int_{-1}^{1} x^k\frac{d^n}{dx^n}(1 - x^2)^{n - 1/2}\,dx = 0$$

for $k = 0, 1, \dots, n-1$. Integrating by parts we obtain

$$\int_{-1}^{1} x^k\frac{d^n}{dx^n}(1 - x^2)^{n - 1/2}\,dx = x^kP_{n-1}(x)(1 - x^2)^{1/2}\Big|_{-1}^{1} - \int_{-1}^{1} kx^{k-1}\frac{d^{n-1}}{dx^{n-1}}(1 - x^2)^{n - 1/2}\,dx.$$

The first term on the right vanishes since $1 - x^2 = 0$ at $x = \pm 1$. Then we
integrate by parts the integral term and repeat the process. In order to obtain
0 at the end, we have to integrate by parts $k + 1$ times. At the last step we get
the derivative $\frac{d^{n-k-1}}{dx^{n-k-1}}$. This means that $n - k - 1$ should be non-negative,
i.e., $k \le n - 1$.
It remains to verify that $\lambda = (-1)^n\,1\cdot 3\cdot 5\cdots(2n-1)$. For this, one can
compute $P_n(1)$. Indeed, for $x = 1$, the recurrence

$$P_{m+1}(x) = (1 - x^2)P_m'(x) - (2n - 2m - 1)xP_m(x)$$

takes the form

$$P_{m+1}(1) = -(2n - 2m - 1)P_m(1).$$

Therefore $P_n(1) = (-1)^n\,1\cdot 3\cdot 5\cdots(2n-1)$. It is also clear that $T_n(1) = 1$.
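The recurrence for $P_m$ makes the statement of Theorem 3.4.7 easy to test for small $n$. The sketch below is only a sanity check under the assumption that polynomials are stored as integer coefficient lists, lowest degree first; the helper names (`poly_mul`, `rodrigues_polynomial`, `chebyshev_T`, `odd_factorial_sign`) are illustrative.

```python
def poly_mul(p, q):
    """Product of two coefficient lists (lowest degree first)."""
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def rodrigues_polynomial(n):
    """P_n from P_0 = 1 and P_{m+1} = (1 - x^2) P_m' - (2n - 2m - 1) x P_m."""
    p = [1]
    for m in range(n):
        derivative = [i * c for i, c in enumerate(p)][1:]
        first = poly_mul([1, 0, -1], derivative)              # (1 - x^2) P_m'
        second = [0] + [(2 * n - 2 * m - 1) * c for c in p]   # (2n - 2m - 1) x P_m
        first += [0] * (len(second) - len(first))
        p = [a - b for a, b in zip(first, second)]
    return p

def chebyshev_T(n):
    """Coefficients of T_n from T_{k+1} = 2x*T_k - T_{k-1}."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur

def odd_factorial_sign(n):
    """lambda = (-1)^n * 1*3*5*...*(2n-1)."""
    lam = (-1) ** n
    for j in range(1, 2 * n, 2):
        lam *= j
    return lam

for n in range(1, 7):
    lam = odd_factorial_sign(n)
    assert rodrigues_polynomial(n) == [lam * c for c in chebyshev_T(n)]
```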


3.4.3 Inequalities for Chebyshev polynomials 


We have shown that Chebyshev polynomials only slightly deviate from zero 
on the interval [—1, 1]. This is compensated by the fact that these polynomials 
and their derivatives grow rapidly outside this segment. More precisely, the 
following statement holds. 


Theorem 3.4.8. [Ro1] Let the polynomial $p(x) = a_0 + a_1x + \dots + a_nx^n$, where
$a_i \in \mathbb{C}$, be such that $|p(x)| \le 1$ for $-1 \le x \le 1$. Then $|p^{(k)}(x)| \le |T_n^{(k)}(x)|$
for $|x| \ge 1$, $x \in \mathbb{R}$.

Proof. We use only the fact that $|p(x_i)| \le 1$ for $x_i = \cos\frac{(n-i)\pi}{n}$, where
$i = 0, 1, \dots, n$. The polynomial $p(x)$ is completely determined by these values
$p(x_i)$. Indeed,

$$p(x) = \sum_{i=0}^{n} \frac{p(x_i)}{g_i(x_i)}\,g_i(x), \qquad (1)$$

where $g_i(x) = \prod_{j \ne i}(x - x_j)$. By differentiating (1) $k$ times we obtain

$$p^{(k)}(x) = \sum_{i=0}^{n} \frac{p(x_i)}{g_i(x_i)}\,g_i^{(k)}(x).$$

Since $|p(x_i)| \le 1$, it follows that

$$|p^{(k)}(x)| \le \sum_{i=0}^{n} \frac{|g_i^{(k)}(x)|}{|g_i(x_i)|}. \qquad (2)$$

The value of $T_n(x)$ at $x_i$ is $\cos(n - i)\pi = (-1)^{n-i}$. Hence

$$|T_n^{(k)}(x)| = \left|\sum_{i=0}^{n} \frac{(-1)^{n-i}\,g_i^{(k)}(x)}{g_i(x_i)}\right|.$$

It is also clear that $\operatorname{sgn} g_i(x_i) = (-1)^{n-i}$. Further, for $|x| \ge 1$, the sign of
$g_i^{(k)}(x)$ does not depend on $i$. Indeed, all roots of $g_i(x)$ belong to $[-1, 1]$, and
hence all the roots of $g_i^{(k)}(x)$ also belong to this interval. Therefore

$$\operatorname{sgn} g_i^{(k)}(x) = \begin{cases} 1 & \text{for } x > 1, \\ (-1)^{n-k} & \text{for } x < -1. \end{cases}$$

As a result for $|x| \ge 1$ we obtain

$$|T_n^{(k)}(x)| = \sum_{i=0}^{n} \frac{|g_i^{(k)}(x)|}{|g_i(x_i)|}.$$

In this case inequality (2) implies that $|p^{(k)}(x)| \le |T_n^{(k)}(x)|$.

Theorem 3.4.8 yields several useful corollaries. We formulate them as separate
theorems.

Theorem 3.4.9. Let $p(x) = a_0 + a_1x + \dots + a_nx^n$, where $a_i \in \mathbb{C}$, be such that
$|p(x)| \le 1$ for $-1 \le x \le 1$. Then $|a_n| \le 2^{n-1}$.

Proof. Recall that $T_n(x) = b_0 + b_1x + \dots + b_nx^n$, where $b_n = 2^{n-1}$.
Therefore, applying Theorem 3.4.8 for $k = n$, we deduce that $|a_n| \le |b_n| = 2^{n-1}$.

Theorem 3.4.10. For $x \le -1$ and $x \ge 1$, we have

$$|T_{n-1}^{(k)}(x)| \le |T_n^{(k)}(x)|.$$

Proof. For the polynomial $p(x) = T_{n-1}(x)$, the conditions of Theorem
3.4.8 hold, and hence $|T_{n-1}^{(k)}(x)| = |p^{(k)}(x)| \le |T_n^{(k)}(x)|$.
Theorem 3.4.11 ([As]). For $x, y \ge 1$, we have

$$T_n(xy) \le T_n(x)T_n(y).$$

Proof. Fix $y \ge 1$ and consider the polynomial $p(x) = \frac{T_n(xy)}{T_n(y)}$. Let us
verify that this polynomial satisfies the condition of Theorem 3.4.8, i.e.,
$|p(x)| = \frac{|T_n(xy)|}{T_n(y)} \le 1$ for $|x| \le 1$. For real $s$, the function $|T_n(s)|$ only
depends on $|s|$. Moreover, if $|s| \ge 1$, then $|T_n(s)|$ monotonically increases
with $|s|$. Clearly, $|T_n(s)| \le 1 \le T_n(y)$ for $|s| \le 1$. Therefore, if $y \ge 1$ and
$|x| \le 1$, we have $|T_n(xy)| \le T_n(y)$.

By Theorem 3.4.8 for $x \ge 1$, we have $|p(x)| \le T_n(x)$, i.e., $T_n(xy) \le T_n(x)T_n(y)$.
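The inequality $T_n(xy) \le T_n(x)T_n(y)$ can be spot-checked with exact rational arithmetic. This is a minimal sketch over an arbitrarily chosen grid of sample points in $[1, 3]$; it illustrates the statement and is not a proof. The function name `chebyshev_T` is illustrative.

```python
from fractions import Fraction

def chebyshev_T(n, x):
    """Exact value of T_n(x) from T_{k+1} = 2x*T_k - T_{k-1}."""
    prev, cur = 1, x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 2 * x * cur - prev
    return cur

samples = [Fraction(k, 4) for k in range(4, 13)]   # 1, 5/4, ..., 3
for n in range(1, 7):
    for x in samples:
        for y in samples:
            assert chebyshev_T(n, x * y) <= chebyshev_T(n, x) * chebyshev_T(n, y)
```

For $n = 2$ the difference of the two sides is $2(x^2 - 1)(y^2 - 1)$, which shows directly why $x, y \ge 1$ is the natural range.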


3.4.4 Generating functions 
For a sequence of functions $a_n(x)$ one can consider the series

$$F(x, z) = \sum_{n=0}^{\infty} a_n(x)z^n.$$

If the radius of convergence of this series is positive, the function $F(x, z)$ is
called the generating function of the sequence $a_n(x)$.

Theorem 3.4.12. For $-1 \le x \le 1$ and $|z| < 1$, we have

(a) $2\sum_{n=1}^{\infty} \frac{T_n(x)}{n}z^n = -\ln(1 - 2xz + z^2)$;

(b) $1 + 2\sum_{n=1}^{\infty} T_n(x)z^n = \frac{1 - z^2}{1 - 2xz + z^2}$.

Proof. a) Let $x = \cos\varphi$. Then

$$1 - 2xz + z^2 = (1 - e^{i\varphi}z)(1 - e^{-i\varphi}z).$$

Hence $\ln(1 - 2xz + z^2) = \ln(1 - e^{i\varphi}z) + \ln(1 - e^{-i\varphi}z)$. It is also clear that

$$-\ln(1 - e^{\pm i\varphi}z) = \sum_{n=1}^{\infty} \frac{e^{\pm in\varphi}}{n}z^n$$

for $|z| < 1$. Hence

$$-\ln(1 - 2xz + z^2) = \sum_{n=1}^{\infty} \frac{2\cos n\varphi}{n}z^n = 2\sum_{n=1}^{\infty} \frac{T_n(x)}{n}z^n.$$

b) By differentiating both parts of (a) with respect to $z$ we get

$$2\sum_{n=1}^{\infty} T_n(x)z^{n-1} = \frac{2x - 2z}{1 - 2xz + z^2}.$$

Therefore

$$1 + 2\sum_{n=1}^{\infty} T_n(x)z^n = 1 + \frac{z(2x - 2z)}{1 - 2xz + z^2} = \frac{1 - z^2}{1 - 2xz + z^2}.$$

With the help of Theorem 3.4.12 we can obtain the following explicit 
expression for Chebyshev polynomials. 


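Theorem 3.4.13. Let $n \ge 1$ and $m = \left[\frac{n}{2}\right]$. Then

$$T_n(x) = \frac{n}{2}\sum_{k=0}^{m} \frac{(-1)^k}{n-k}\binom{n-k}{k}(2x)^{n-2k}.$$

Proof. By Theorem 3.4.12 (a)

$$\sum_{n=1}^{\infty} \frac{T_n(x)}{n}z^n = -\frac{1}{2}\ln(1 - 2xz + z^2) = \frac{1}{2}\sum_{p=1}^{\infty} \frac{(2xz - z^2)^p}{p} = \frac{1}{2}\sum_{p=1}^{\infty}\sum_{k=0}^{p} (-1)^k\frac{1}{p}\binom{p}{k}(2x)^{p-k}z^{p+k}.$$

Hence

$$T_n(x) = \frac{n}{2}\sum_{p+k=n} (-1)^k\frac{1}{p}\binom{p}{k}(2x)^{p-k} = \frac{n}{2}\sum_{k=0}^{M} \frac{(-1)^k}{n-k}\binom{n-k}{k}(2x)^{n-2k}.$$

The summation is performed until $n - 2k \ge 0$, and hence $M = \left[\frac{n}{2}\right] = m$.

For the polynomial $P_n(x) = 2T_n\left(\frac{x}{2}\right)$ we get a neater explicit formula,
namely:

$$P_n(x) = \sum_{k=0}^{m} (-1)^k\frac{n}{n-k}\binom{n-k}{k}x^{n-2k}, \qquad (1)$$

where $m = \left[\frac{n}{2}\right]$.

Recall that the polynomial $P_n(x)$ corresponds to the polynomial expression
of $z^n + z^{-n}$ in terms of $z + z^{-1}$ (this follows from the fact that, for $z = e^{i\varphi}$,
we have $z^n + z^{-n} = 2\cos n\varphi$ and $z + z^{-1} = 2\cos\varphi$).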
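The explicit formula of Theorem 3.4.13 can be compared with the three-term recurrence $T_{k+1} = 2xT_k - T_{k-1}$. The following is a minimal sketch with illustrative function names; coefficient lists are stored lowest degree first and `Fraction` is used so that the factor $\frac{n}{2}\cdot\frac{1}{n-k}$ is handled exactly.

```python
from fractions import Fraction
from math import comb

def chebyshev_explicit(n):
    """Coefficients of T_n (n >= 1) from the explicit sum over k = 0..[n/2]."""
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(n // 2 + 1):
        coeffs[n - 2 * k] = (Fraction(n, 2) * (-1) ** k
                             * Fraction(comb(n - k, k), n - k) * 2 ** (n - 2 * k))
    return coeffs

def chebyshev_recurrence(n):
    """Coefficients of T_n from T_{k+1} = 2x*T_k - T_{k-1}."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur

for n in range(1, 12):
    assert chebyshev_explicit(n) == chebyshev_recurrence(n)
```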
It is easy to verify that for $n = 2m + 1$

$$(z + z^{-1})^n = \sum_{k=0}^{m} \binom{n}{k}\left(z^{n-2k} + z^{-(n-2k)}\right),$$

and for $n = 2m$

$$(z + z^{-1})^n = \sum_{k=0}^{m-1} \binom{n}{k}\left(z^{n-2k} + z^{-(n-2k)}\right) + \binom{n}{m}.$$

Therefore, if $P_0(x) = 1$ and the polynomials $P_n(x)$ for $n \ge 1$ are given by (1),
then

$$x^n = \sum_{k=0}^{m} \binom{n}{k}P_{n-2k}(x), \qquad (2)$$

where $m = \left[\frac{n}{2}\right]$.

Relations (1), (2) can be expressed as follows. Let $a_n = x^n$ and $b_n = P_n(x)$,
where $x$ is fixed. Then

$$a_n = \sum_{k=0}^{m} \binom{n}{k}b_{n-2k}, \qquad b_n = \sum_{k=0}^{m} (-1)^k\frac{n}{n-k}\binom{n-k}{k}a_{n-2k} \qquad (3)$$

(for $n = 0$ the second relation takes the form $b_0 = a_0$). Let us prove that the
relations (3) are equivalent not only for the indicated sequences but also for
arbitrary sequences.

First of all, observe that the first relation is of the form

$$a_n = b_n + \sum_{i \ge 1} \beta_{n-i}b_{n-i}$$

and the second relation is of the form

$$b_n = a_n + \sum_{i \ge 1} \alpha_{n-i}a_{n-i}.$$

Hence each relation uniquely determines both the sequence $a_n$ in terms of the
sequence $b_n$ and vice versa. It is also clear that for the sequences

$$a_n = \sum_i \lambda_ix_i^n \quad\text{and}\quad b_n = \sum_i \lambda_iP_n(x_i), \quad\text{where the } \lambda_i \text{ and } x_i \text{ are fixed,}$$

the relations (3) are equivalent because they are equivalent for the sequences
$a_n = x_i^n$ and $b_n = P_n(x_i)$.

It remains to verify that for any sequence $a_0, a_1, \dots, a_n$ we can select
numbers $\lambda_0, \dots, \lambda_n$ and $x_0, \dots, x_n$ so that

$$a_l = \sum_{i=0}^{n} \lambda_ix_i^l \quad\text{for } l = 0, 1, \dots, n. \qquad (*)$$

Select arbitrary distinct numbers $x_0, \dots, x_n$. Then we obtain a system of linear
equations for the numbers $\lambda_0, \dots, \lambda_n$ with the determinant

$$\begin{vmatrix} 1 & \dots & 1 \\ x_0 & \dots & x_n \\ \vdots & & \vdots \\ x_0^n & \dots & x_n^n \end{vmatrix} = \prod_{i > j}(x_i - x_j) \ne 0;$$

hence the system (*) has solutions for any $a_0, \dots, a_n$.
Relations (3) enable us to obtain nontrivial identities involving binomial
coefficients. Let, for instance, $b_n = 1$ for all $n$. Then

$$a_{2m+1} = \sum_{k=0}^{m} \binom{2m+1}{k} = 2^{2m},$$

$$a_{2m} = \sum_{k=0}^{m} \binom{2m}{k} = 2^{2m-1} + \frac{1}{2}\binom{2m}{m}.$$

These identities are easy to derive from the expansions of $(1 + 1)^{2m+1}$ and
$(1 + 1)^{2m}$ via the binomial formulas. In this case the relation

$$b_n = \sum_{k=0}^{m} (-1)^k\frac{n}{n-k}\binom{n-k}{k}a_{n-2k}$$

yields the identities

$$1 = \sum_{k=0}^{m} (-1)^k\frac{2m+1}{2m+1-k}\binom{2m+1-k}{k}2^{2m-2k},$$

$$1 = \sum_{k=0}^{m} (-1)^k\frac{2m}{2m-k}\binom{2m-k}{k}\cdot\frac{1}{2}\left(2^{2m-2k} + \binom{2m-2k}{m-k}\right).$$

3.5 Bernoulli polynomials 


3.5.1 Definition of Bernoulli polynomials 


Consider the function

$$g(z, t) = \frac{te^{tz}}{e^t - 1}.$$

For $t = 2k\pi i$ the denominator vanishes, so that at such values of $t$ the function
$g(z, t)$ may have singularities. But $g$ is regular at $t = 0$, and therefore we can
expand $g(z, t)$ into the series

$$g(z, t) = \sum_{n=0}^{\infty} B_n(z)\frac{t^n}{n!},$$

which converges for $|t| < 2\pi$.
As we will see shortly, $B_n(z)$ is a polynomial of degree $n$. The $B_n(z)$
are called Bernoulli polynomials and the numbers $B_n = B_n(0)$ are called
Bernoulli numbers.

The series for $g(z, t)$ is the product of the series

$$g(0, t) = \sum_{n=0}^{\infty} B_n\frac{t^n}{n!} \quad\text{and}\quad e^{tz} = \sum_{n=0}^{\infty} \frac{z^nt^n}{n!}.$$

Hence

$$\frac{B_n(z)}{n!} = \sum_{k=0}^{n} \frac{B_{n-k}z^k}{k!\,(n-k)!},$$

i.e.,

$$B_n(z) = \sum_{k=0}^{n} \binom{n}{k}B_{n-k}z^k.$$

Formally, this identity can be expressed as $B_n(z) = (B + z)^n$, where by definition
$B^{n-k} = B_{n-k}$.
One of the most important properties of Bernoulli polynomials is as follows:

$$B_n(z + 1) - B_n(z) = nz^{n-1}. \qquad (1)$$

To prove (1), it suffices to observe that

$$\sum_{n=0}^{\infty} \left(B_n(z+1) - B_n(z)\right)\frac{t^n}{n!} = g(z+1, t) - g(z, t) = \frac{te^{t(z+1)}}{e^t - 1} - \frac{te^{tz}}{e^t - 1} = te^{tz} = \sum_{n=0}^{\infty} \frac{z^nt^{n+1}}{n!}.$$

Let us sum the identities (1) for $z = 0, 1, \dots, m-1$. As a result we obtain

$$\sum_{k=0}^{m-1} k^{n-1} = \frac{1}{n}\left(B_n(m) - B_n(0)\right). \qquad (2)$$

This means, in particular, that the sum

$$1 + 2^{n-1} + \dots + (m-1)^{n-1}$$

is a polynomial of degree $n$ in $m$. This is precisely the property that J. Bernoulli
discovered [Be6].
In 1738, Euler suggested the generating function

$$\frac{te^{tz}}{e^t - 1} = \sum_{n=0}^{\infty} \frac{t^n}{n!}B_n(z).$$
It is convenient to compute Bernoulli polynomials from the recurrence
formula

$$\sum_{r=0}^{n-1} \binom{n}{r}B_r(z) = nz^{n-1}, \quad n \ge 2. \qquad (3)$$

This formula can be proved as follows:

$$\sum_{n=0}^{\infty} B_n(z+1)\frac{t^n}{n!} = g(z+1, t) = g(z, t)\,e^t = \left(\sum_{r=0}^{\infty} B_r(z)\frac{t^r}{r!}\right)\left(\sum_{s=0}^{\infty} \frac{t^s}{s!}\right),$$

and hence

$$B_n(z+1) = \sum_{r=0}^{n} \binom{n}{r}B_r(z) = B_n(z) + \sum_{r=0}^{n-1} \binom{n}{r}B_r(z).$$

It remains to use (1).
When $z = 0$ the relations (3) become a recurrence formula for the Bernoulli
numbers

$$\sum_{r=0}^{n-1} \binom{n}{r}B_r = 0, \quad n \ge 2. \qquad (4)$$

It is easy to verify that $B_0 = 1$. Therefore from (4) we recursively obtain

$$B_1 = -\frac{1}{2}, \quad B_2 = \frac{1}{6}, \quad B_3 = 0, \quad B_4 = -\frac{1}{30}, \quad B_5 = 0, \ \dots$$

It is not difficult to show that $B_{2k+1} = 0$ for $k \ge 1$. Indeed,

$$\frac{t}{e^t - 1} = 1 - \frac{t}{2} + \sum_{n=2}^{\infty} \frac{t^n}{n!}B_n.$$

Hence, it suffices to verify that the function

$$\frac{t}{e^t - 1} + \frac{t}{2} = \frac{t}{2}\cdot\frac{e^t + 1}{e^t - 1}$$

is even, and this is evident.
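Recurrence (4) translates directly into a short exact computation. This is a minimal sketch; the function name `bernoulli_numbers` is illustrative, and the relation for index $n$ is solved for $B_{n-1}$, whose coefficient is $\binom{n}{n-1} = n$.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] using sum_{r<n} C(n, r) B_r = 0 for n >= 2."""
    B = [Fraction(1)]                       # B_0 = 1
    for n in range(2, count + 1):
        # The relation for n involves B_0..B_{n-1}; solve it for B_{n-1}.
        partial = sum(comb(n, r) * B[r] for r in range(n - 1))
        B.append(-partial / comb(n, n - 1))
    return B

B = bernoulli_numbers(13)
print(B[1], B[2], B[4])                     # -1/2 1/6 -1/30
assert all(B[k] == 0 for k in range(3, 13, 2))
```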
In 1832, Appel showed that the Bernoulli polynomials satisfy the relation

$$B_{n+1}'(z) = (n + 1)B_n(z).$$

To prove this, we differentiate the identity

$$\sum_{n=0}^{\infty} \frac{t^n}{n!}B_n(z) = \frac{te^{tz}}{e^t - 1}$$

with respect to $z$. We obtain

$$\sum_{n=0}^{\infty} \frac{t^n}{n!}B_n'(z) = \frac{t^2e^{tz}}{e^t - 1} = \sum_{n=0}^{\infty} \frac{t^{n+1}}{n!}B_n(z).$$

Equating the coefficients of $t^{n+1}$ we get Appel's relation.
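The difference relation (1), the power-sum formula (2) and Appel's relation can all be verified on small cases with exact arithmetic. The sketch below assumes the expansion $B_n(z) = \sum_k \binom{n}{k}B_{n-k}z^k$ from this section; the helper names are illustrative.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from sum_{r<n} C(n, r) B_r = 0, n >= 2."""
    B = [Fraction(1)]
    for n in range(2, count + 1):
        B.append(-sum(comb(n, r) * B[r] for r in range(n - 1)) / n)
    return B

def bernoulli_polynomial(n, B):
    """Coefficients (lowest degree first) of B_n(z) = sum_k C(n, k) B_{n-k} z^k."""
    return [comb(n, k) * B[n - k] for k in range(n + 1)]

def evaluate(coeffs, z):
    return sum(c * z**k for k, c in enumerate(coeffs))

B = bernoulli_numbers(10)
z = Fraction(2, 3)
for n in range(1, 8):
    Bn = bernoulli_polynomial(n, B)
    # (1): B_n(z+1) - B_n(z) = n z^(n-1)
    assert evaluate(Bn, z + 1) - evaluate(Bn, z) == n * z ** (n - 1)
    # (2): 0^(n-1) + 1^(n-1) + ... + 4^(n-1) = (B_n(5) - B_n(0)) / n
    assert sum(k ** (n - 1) for k in range(5)) == (evaluate(Bn, 5) - evaluate(Bn, 0)) / n
    # Appel's relation: the derivative of B_{n+1} equals (n+1) B_n
    nxt = bernoulli_polynomial(n + 1, B)
    assert [k * c for k, c in enumerate(nxt)][1:] == [(n + 1) * c for c in Bn]
```

For $n = 1$ the sum in (2) uses the convention $0^0 = 1$, which is also what Python's integer power returns.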
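3.5.2 Theorems of complement, addition of arguments and
multiplication


Bernoulli polynomials possess the following properties:

$$B_n(1 - x) = (-1)^nB_n(x) \quad\text{(Theorem of complement);}$$

$$B_n(x + y) = \sum_{r=0}^{n} \binom{n}{r}B_r(x)\,y^{n-r} \quad\text{(Theorem of addition of arguments);}$$

$$B_n(mx) = m^{n-1}\sum_{k=0}^{m-1} B_n\left(x + \frac{k}{m}\right) \quad\text{(Theorem of multiplication).}$$

All these theorems are easy to deduce from the relation

$$\sum_{n=0}^{\infty} \frac{t^n}{n!}B_n(z) = \frac{te^{tz}}{e^t - 1}.$$

To prove the first property, we observe that

$$\sum_{n=0}^{\infty} \frac{t^n}{n!}B_n(1 - z) = \frac{te^{t(1-z)}}{e^t - 1} = \frac{-te^{-tz}}{e^{-t} - 1} = \sum_{n=0}^{\infty} \frac{(-t)^n}{n!}B_n(z).$$

The second property, on the addition of arguments, is proved as follows:

$$\sum_{n=0}^{\infty} \frac{t^n}{n!}B_n(x + y) = \frac{te^{tx}}{e^t - 1}\,e^{ty} = \left(\sum_{s=0}^{\infty} \frac{t^s}{s!}B_s(x)\right)\left(\sum_{r=0}^{\infty} \frac{t^ry^r}{r!}\right) = \sum_{r,s=0}^{\infty} \frac{t^{r+s}}{r!\,s!}\,y^rB_s(x) = \sum_{n=0}^{\infty} \frac{t^n}{n!}\sum_{s=0}^{n} \binom{n}{s}B_s(x)\,y^{n-s}.$$

To prove the multiplication property, we use the identity

$$\frac{1}{e^t - 1} = \frac{1 + e^t + \dots + e^{(m-1)t}}{e^{mt} - 1}.$$

This gives

$$\sum_{n=0}^{\infty} \frac{t^n}{n!}B_n(mz) = \frac{te^{mtz}}{e^t - 1} = \frac{1}{m}\cdot\frac{mt\,e^{mtz}\left(1 + e^t + \dots + e^{(m-1)t}\right)}{e^{mt} - 1} = \frac{1}{m}\sum_{k=0}^{m-1} \frac{mt\,e^{mt(z + k/m)}}{e^{mt} - 1} = \frac{1}{m}\sum_{k=0}^{m-1}\sum_{n=0}^{\infty} \frac{(mt)^n}{n!}B_n\left(z + \frac{k}{m}\right).$$

The multiplication property shows that $B_n(x)$ is a solution of the functional
equation

$$\sum_{k=0}^{m-1} f\left(x + \frac{k}{m}\right) = m^{1-n}f(mx). \qquad (1)$$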
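The three theorems of this subsection can be checked at rational points. The following is a minimal sketch with illustrative names; it evaluates $B_n$ through the expansion $B_n(z) = \sum_k \binom{n}{k}B_{n-k}z^k$ and tests the complement, addition and multiplication properties for small $n$ and $m$.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from sum_{r<n} C(n, r) B_r = 0, n >= 2."""
    B = [Fraction(1)]
    for n in range(2, count + 1):
        B.append(-sum(comb(n, r) * B[r] for r in range(n - 1)) / n)
    return B

B = bernoulli_numbers(10)

def bernoulli_value(n, z):
    """B_n(z) for a rational z."""
    return sum(comb(n, k) * B[n - k] * z**k for k in range(n + 1))

x, y = Fraction(2, 7), Fraction(-3, 5)
for n in range(0, 8):
    # Theorem of complement
    assert bernoulli_value(n, 1 - x) == (-1) ** n * bernoulli_value(n, x)
    # Theorem of addition of arguments
    assert bernoulli_value(n, x + y) == sum(
        comb(n, r) * bernoulli_value(r, x) * y ** (n - r) for r in range(n + 1))
    # Theorem of multiplication, for m = 2, 3, 4
    for m in (2, 3, 4):
        rhs = Fraction(m) ** (n - 1) * sum(
            bernoulli_value(n, x + Fraction(k, m)) for k in range(m))
        assert bernoulli_value(n, m * x) == rhs
```

`Fraction(m) ** (n - 1)` is used instead of `m ** (n - 1)` so that the case $n = 0$ stays exact.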
Theorem 3.5.1 ([Le1]). For fixed $m, n > 1$, there exists only one monic
polynomial of degree $n$ satisfying (1).

Proof. The existence of the polynomial required can be proved directly,
but instead we give a proof which uses the fact already known to us, namely,
that Bernoulli polynomials satisfy the functional equation (1).

Let $p(x) = x^n + \dots$ and $q(x) = x^n + \dots$ be two distinct polynomials
satisfying (1). Their difference $\Delta(x) = a_dx^d + \dots$, where $a_d \ne 0$ and $d < n$,
also satisfies (1). Comparing the coefficients of $x^d$ on both sides of the equation

$$\sum_{k=0}^{m-1} \Delta\left(x + \frac{k}{m}\right) = m^{1-n}\Delta(mx)$$

we see that $a_d = m^{d-n}a_d$. This contradicts the condition that $m > 1$ and the
assumption that $a_d \ne 0$ and $d < n$.

Making a change of variables we can reduce (1) to the form

$$f(x) = m^{s-1}\sum_{k=0}^{m-1} f\left(\frac{x + k}{m}\right), \qquad (2)$$

where $s = n$. Kubert [Ku] studied continuous functions $f : (0, 1) \to \mathbb{C}$ satisfying
(1) with $s$ a positive integer. In the more general case $s \in \mathbb{C}$, the space of
such functions is two-dimensional and one can select a basis $f_{\text{even}}$, $f_{\text{odd}}$ in it
so that

$$f_{\text{even}}(x) = f_{\text{even}}(1 - x) \quad\text{and}\quad f_{\text{odd}}(x) = -f_{\text{odd}}(1 - x).$$

John Milnor wrote a long and interesting paper [Mi3] on various properties of
solutions of (2).


3.5.3 Euler’s formula 


Let $s \in \mathbb{C}$ and $\operatorname{Re} s > 1$. Then the series $\sum_{n=1}^{\infty} \frac{1}{n^s}$ converges. The function

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$$

has an analytic continuation to the whole complex plane $\mathbb{C}$. This continuation
has a simple pole at $s = 1$ and is regular elsewhere.
It was already Euler who considered the series $\sum_{n=1}^{\infty} \frac{1}{n^s}$ at integer points $s$.
But the first to consider $\zeta(s)$ as a function of a complex variable was Riemann,
and it was Riemann who discovered the most profound properties and most
important applications of $\zeta(s)$. This is why $\zeta(s)$ is called the Riemann
zeta-function.
Theorem 3.5.2 (Euler). If $k$ is a positive integer, then

$$\zeta(2k) = \sum_{n=1}^{\infty} \frac{1}{n^{2k}} = \frac{(-1)^{k+1}B_{2k}\,2^{2k-1}\pi^{2k}}{(2k)!}.$$

Proof. We use the factorization of $\sin z$ into an infinite product; taking its
logarithmic derivative gives

$$\frac{\cos z}{\sin z} = \frac{1}{z} + \sum_{n=1}^{\infty} \frac{2z}{z^2 - n^2\pi^2},$$

i.e.,

$$\frac{z\cos z}{\sin z} = 1 - 2\sum_{n=1}^{\infty}\sum_{k=1}^{\infty} \frac{z^{2k}}{n^{2k}\pi^{2k}}. \qquad (1)$$

On the other hand, substituting $t = 2iz$ into the identity

$$\frac{t}{e^t - 1} = 1 - \frac{t}{2} + \sum_{m=2}^{\infty} B_m\frac{t^m}{m!}$$

we obtain

$$\frac{z\cos z}{\sin z} = iz\,\frac{e^{iz} + e^{-iz}}{e^{iz} - e^{-iz}} = 1 + \sum_{m=2}^{\infty} B_m\frac{(2iz)^m}{m!}. \qquad (2)$$

Comparison of the coefficients of $z^{2k}$ in (1) and (2) yields the identity desired.

Euler's formula for $\zeta(2k)$ makes it obvious that $\zeta(2k)$ is a transcendental
number since $\pi$ is transcendental. There is no similarly convenient formula for
$\zeta(2k + 1)$. It was only in 1978 that R. Apéry proved the irrationality of $\zeta(3)$.
The simplest (known to me) proof of irrationality of $\zeta(3)$ is given in [Be7].
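Euler's formula can be compared numerically with partial sums of the series. The sketch below is a floating-point illustration only; the names `bernoulli_numbers` and `zeta_even` are illustrative, and the Bernoulli numbers are computed exactly from recurrence (4) of Section 3.5.1 before being converted to floats.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from sum_{r<n} C(n, r) B_r = 0, n >= 2."""
    B = [Fraction(1)]
    for n in range(2, count + 1):
        B.append(-sum(comb(n, r) * B[r] for r in range(n - 1)) / n)
    return B

def zeta_even(k):
    """zeta(2k) = (-1)^(k+1) * B_{2k} * 2^(2k-1) * pi^(2k) / (2k)!"""
    B = bernoulli_numbers(2 * k + 1)
    return (-1) ** (k + 1) * float(B[2 * k]) * 2 ** (2 * k - 1) * pi ** (2 * k) / factorial(2 * k)

partial = sum(1 / n**4 for n in range(1, 2001))
print(zeta_even(2), partial)     # the two values agree to about ten digits
```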


3.5.4 The Faulhaber-Jacobi theorem 
Mathematicians were interested in the summation of the series of powers

$$1^m + 2^m + 3^m + \dots + n^m$$


long before Bernoulli. In 1617, the German mathematician Johann Faulhaber 
(1580-1635) published a book in which he gave sums of such series for m < 11. 
In 1631, in the book [Fa], he extended his calculations up to m = 17. 

Pierre Fermat also studied the summation of such series. In 1636 he wrote 
to Mersenne: 
“... We do not want to dwell on this here, let us just mention that we
have obtained a solution, perhaps the simplest in the whole arithmetics of the 
problem thanks to which we can not only find the sum of squares or cubes of 
any progression, but in general the sum of all powers up to infinity thanks to 
the most general method; squares of squares, squares of cubes, and so on.” 


Many historians of mathematics are inclined to believe that Fermat really 
obtained the solution of this problem almost a century before Bernoulli. 

Faulhaber, in his book [Fa], observed that all the sums $\sum n^k$ can be polynomially
expressed in terms of the first two sums $\sum n$ and $\sum n^2$. Two hundred
years later, in 1834, Jacobi rediscovered Faulhaber's theorem. It is known that
Jacobi possessed Faulhaber's book but it is not known whether he read it or
not.

For convenience, in the proof we introduce the polynomials

$$S_{n-1}(m) = \frac{1}{n}\left(B_n(m) - B_n(0)\right).$$

Formula (2) of Section 3.5.1 shows that

$$S_n(m) = 1^n + 2^n + \dots + (m-1)^n.$$

Theorem 3.5.3 (Faulhaber-Jacobi). Let $U = S_1(x)$ and $V = S_2(x)$. For
$k \ge 1$, there exist polynomials $P_k$ and $Q_k$ with rational coefficients, such that
$S_{2k+1}(x) = U^2P_k(U)$ and $S_{2k}(x) = VQ_k(U)$.


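Proof. To obtain the expression for $S_{2k+1}$, we use the identity

$$(n(n-1))^r = \sum_{m=1}^{n-1} \left(m^r(m+1)^r - m^r(m-1)^r\right) = 2\left(\binom{r}{1}\sum m^{2r-1} + \binom{r}{3}\sum m^{2r-3} + \binom{r}{5}\sum m^{2r-5} + \dots\right), \qquad (1)$$

i.e.,

$$(n(n-1))^i = 2\sum_{j} \binom{i}{2(i-j)+1}S_{2j-1}(n).$$

These identities can be expressed in the matrix form

$$\begin{pmatrix} n^2(n-1)^2 \\ n^3(n-1)^3 \\ n^4(n-1)^4 \\ \vdots \end{pmatrix} = 2\begin{pmatrix} 2 & 0 & 0 & \dots \\ 1 & 3 & 0 & \dots \\ 0 & 4 & 4 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}\begin{pmatrix} S_3(n) \\ S_5(n) \\ S_7(n) \\ \vdots \end{pmatrix}.$$

In the infinite matrix $\begin{pmatrix} 2 & 0 & 0 & \dots \\ 1 & 3 & 0 & \dots \\ 0 & 4 & 4 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$ all the principal minors of finite order are
invertible, and hence the matrix itself is invertible and we can write

$$\begin{pmatrix} S_3(n) \\ S_5(n) \\ S_7(n) \\ \vdots \end{pmatrix} = \frac{1}{2}\,\|a_{ij}\|^{-1}\begin{pmatrix} n^2(n-1)^2 \\ n^3(n-1)^3 \\ n^4(n-1)^4 \\ \vdots \end{pmatrix}, \quad\text{where } a_{ij} = \binom{i}{2(i-j)+1}.$$

This formula shows that $S_{2k+1}(n)$ can be expressed in terms of $n(n-1) = 2U(n)$
and is divisible by $(n(n-1))^2$.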
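To obtain an expression for $S_{2k}$, we use the identities

$$n^{r+1}(n-1)^r = \sum_{m=1}^{n-1} \left(m^r(m+1)^{r+1} - (m-1)^rm^{r+1}\right) =$$

$$= \sum m^{2r}\left(\binom{r+1}{1} + \binom{r}{1}\right) + \sum m^{2r-1}\left(\binom{r+1}{2} - \binom{r}{2}\right) + \sum m^{2r-2}\left(\binom{r+1}{3} + \binom{r}{3}\right) + \sum m^{2r-3}\left(\binom{r+1}{4} - \binom{r}{4}\right) + \dots =$$

$$= \left(\binom{r+1}{1} + \binom{r}{1}\right)\sum m^{2r} + \left(\binom{r+1}{3} + \binom{r}{3}\right)\sum m^{2r-2} + \dots + \binom{r}{1}\sum m^{2r-1} + \binom{r}{3}\sum m^{2r-3} + \dots$$

The sums of odd powers can be eliminated with the help of (1). As a result
we obtain

$$n^{r+1}(n-1)^r = \frac{n^r(n-1)^r}{2} + \left(\binom{r+1}{1} + \binom{r}{1}\right)\sum m^{2r} + \left(\binom{r+1}{3} + \binom{r}{3}\right)\sum m^{2r-2} + \dots,$$

i.e.,

$$n^r(n-1)^r\,\frac{2n-1}{2} = \left(\binom{r+1}{1} + \binom{r}{1}\right)\sum m^{2r} + \left(\binom{r+1}{3} + \binom{r}{3}\right)\sum m^{2r-2} + \dots$$

Now, similarly to the above, we obtain

$$\begin{pmatrix} S_2(n) \\ S_4(n) \\ S_6(n) \\ \vdots \end{pmatrix} = \frac{2n-1}{2}\,\|b_{ij}\|^{-1}\begin{pmatrix} n(n-1) \\ n^2(n-1)^2 \\ n^3(n-1)^3 \\ \vdots \end{pmatrix},$$

where $b_{ij} = \binom{i+1}{2(i-j)+1} + \binom{i}{2(i-j)+1}$. Simple calculations show that

$$S_2(n) = \frac{2n-1}{2}\cdot\frac{n(n-1)}{3}.$$

Therefore the polynomials $S_4(n), S_6(n), \dots$ are divisible by $S_2(n)$, and
$\frac{S_{2k}(n)}{S_2(n)}$ is a polynomial in $n(n-1) = 2U(n)$.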
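The triangular system coming from identity (1) can be solved step by step to express the odd power sums as polynomials in $U = n(n-1)/2$. The sketch below is an illustration with hypothetical names; it uses $(n(n-1))^r = (2U)^r$ and solves the relation for $r = 2, 3, \dots$ for $S_{2r-1}$, the sums being taken over $m = 1, \dots, n-1$ as in the text.

```python
from fractions import Fraction
from math import comb

def odd_power_sums_in_U(max_r):
    """Return {2r-1: coeffs} with S_{2r-1} = sum_i coeffs[i] * U**i, r = 2..max_r."""
    S = {}
    for r in range(2, max_r + 1):
        # (n(n-1))^r / 2 = 2^(r-1) * U^r, written as a polynomial in U.
        rhs = [Fraction(0)] * (r + 1)
        rhs[r] = Fraction(2) ** (r - 1)
        # Subtract the already known sums S_{2r-j} for j = 3, 5, ...
        for j in range(3, r + 1, 2):
            for i, c in enumerate(S[2 * r - j]):
                rhs[i] -= comb(r, j) * c
        # The remaining term is C(r, 1) * S_{2r-1} = r * S_{2r-1}.
        S[2 * r - 1] = [c / r for c in rhs]
    return S

def eval_in_U(coeffs, n):
    U = Fraction(n * (n - 1), 2)
    return sum(c * U**i for i, c in enumerate(coeffs))

S = odd_power_sums_in_U(6)
for k, coeffs in S.items():
    assert coeffs[0] == 0 and coeffs[1] == 0          # divisible by U^2
    for n in range(1, 10):
        assert eval_in_U(coeffs, n) == sum(m**k for m in range(1, n))
```

For example, the computation gives $S_3 = U^2$ and $S_5 = U^2(4U - 1)/3$.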
3.5.5 Arithmetic properties of Bernoulli numbers and Bernoulli
polynomials


In this section we prove several theorems on the denominators of the values of
Bernoulli polynomials at rational points. The most interesting are the values
of Bernoulli polynomials at 0, i.e., Bernoulli numbers.

When we formulate statements on denominators of rational numbers it is
convenient to use the notion of $p$-integers. If $p$ is a prime, the rational number
$r$ is said to be a $p$-integer if $p$ does not enter the denominator of $r$, i.e., if the
denominator $t$ of the irreducible fraction $\frac{s}{t} = r$ is not divisible by $p$.

For a rational $r$, the expression $r \equiv 0 \pmod p$ will mean that the numerator
$s$ of the irreducible fraction $\frac{s}{t} = r$ is divisible by $p$. It is easy to verify
that if $r_1 \equiv r_2 \equiv 0 \pmod p$, then $r_1r_2 \equiv 0 \pmod p$ and $r_1 + r_2 \equiv 0 \pmod p$.
Moreover, if $r_1$ is a $p$-integer and $r_2 \equiv 0 \pmod p$, then $r_1r_2 \equiv 0 \pmod p$.


Theorem 3.5.4 (Kummer). Let $p$ be a prime. If the positive integer $n$ is
not divisible by $p - 1$, then $\frac{B_n}{n}$ is a $p$-integer and

$$\frac{B_{n+p-1}}{n+p-1} \equiv \frac{B_n}{n} \pmod p.$$

Proof. The multiplicative group of the field $\mathbb{F}_p$ is cyclic, and hence has a
generator. This means that there exists a positive integer $a$ lying between 1
and $p$ for which $a^k \not\equiv 1 \pmod p$ for $k = 1, \dots, p-2$. Consider the function

$$F(t) = \frac{a}{e^{at} - 1} - \frac{1}{e^t - 1} = \sum_{k=1}^{\infty} (a^k - 1)\frac{B_k}{k}\cdot\frac{t^{k-1}}{(k-1)!} = \sum_{k=1}^{\infty} A_k\frac{t^{k-1}}{(k-1)!},$$

where $A_k = (a^k - 1)\frac{B_k}{k}$.
It suffices to prove that all the numbers $A_k$ are $p$-integers and

$$A_{k+p-1} \equiv A_k \pmod p.$$

Indeed, if $n$ is not divisible by $p - 1$, then $a^n \not\equiv 1 \pmod p$ and so the equation
$\frac{B_n}{n} = \frac{A_n}{a^n - 1}$ shows that $\frac{B_n}{n}$ is also a $p$-integer. But since $a^{p-1} \equiv 1 \pmod p$
we have

$$\frac{B_{n+p-1}}{n+p-1} = \frac{A_{n+p-1}}{a^{n+p-1} - 1} \equiv \frac{A_n}{a^n - 1} = \frac{B_n}{n} \pmod p.$$

Now set $u = e^t - 1$. Then

$$\frac{a}{e^{at} - 1} - \frac{1}{e^t - 1} = \frac{a}{(1+u)^a - 1} - \frac{1}{u} = \frac{a}{au + \sum_{k \ge 2} b_ku^k} - \frac{1}{u} = \frac{1}{u}\left(\frac{1}{1 + \sum_{k \ge 2} \frac{b_k}{a}u^{k-1}} - 1\right) = \sum_{r=0}^{\infty} c_ru^r,$$

where $b_k = \binom{a}{k}$. All the numbers $c_r$ are $p$-integers since this is true of the numbers $\frac{b_k}{a}$. Therefore

$$F(t) = \sum_{k=1}^{\infty} A_k\frac{t^{k-1}}{(k-1)!} = \sum_{r=0}^{\infty} c_r(e^t - 1)^r,$$

where all the coefficients $c_r$ are $p$-integers.
The function $(e^t - 1)^r$ can be represented as a linear combination with
integer coefficients of the functions $e^{mt}$, where $m = 0, 1, \dots, r$. In turn,

$$e^{mt} = \sum_{l=0}^{\infty} \frac{m^lt^l}{l!}.$$

Therefore $A_k$ can be represented as a linear combination of the numbers $m^{k-1}$
with $p$-integer coefficients. The equality $c_r(e^t - 1)^r = c_rt^r + \dots$ shows that
this linear combination contains finitely many summands.

Since the sum and the product of $p$-integers is a $p$-integer, it follows that
$A_k$ is also a $p$-integer. It is also clear that

$$m^{(k-1)+(p-1)} - m^{k-1} = m^{k-1}(m^{p-1} - 1) \equiv 0 \pmod p.$$

Therefore $A_{k+p-1} - A_k$ is a linear combination with $p$-integer coefficients of
rational numbers whose numerators are divisible by $p$. This means that

$$A_{k+p-1} \equiv A_k \pmod p.$$


Theorem 3.5.5 (von Staudt). Let $n$ be even and $p$ a prime. If $n$ is not
divisible by $p - 1$, then $B_n$ is a $p$-integer and, if $n$ is divisible by $p - 1$, then
$pB_n \equiv -1 \pmod p$.

Proof. By the preceding theorem if $p$ is prime and $n$ is not divisible by
$p - 1$, then the denominator of $B_n$ is not divisible by $p$. Thus it remains to
consider the case where $n$ is divisible by $p - 1$.
By multiplying the identities $\sum_{k=0}^{\infty} B_k\frac{t^k}{k!} = \frac{t}{e^t - 1}$ and $\sum_{n=1}^{\infty} \frac{p^nt^{n-1}}{n!} = \frac{e^{pt} - 1}{t}$
we obtain

$$\sum_{n=1}^{\infty}\sum_{k=0}^{\infty} \frac{p^nB_k}{n!\,k!}\,t^{n+k-1} = \frac{e^{pt} - 1}{e^t - 1} = \sum_{r=0}^{p-1} e^{rt} = \sum_{r=0}^{p-1}\sum_{s=0}^{\infty} \frac{r^st^s}{s!}.$$

Comparison of the coefficients of $t^n$ on each side shows that

$$\sum_{k=0}^{n} \binom{n+1}{k}\frac{p^{n+1-k}B_k}{n+1} = \sum_{r=0}^{p-1} r^n.$$

Since $B_{n+1} = B_{n-1} = 0$ we obtain that

$$pB_n + \sum_{k=0}^{n-2} \binom{n}{k}\frac{p^{n-k}}{n-k+1}\,pB_k = \sum_{r=1}^{p-1} r^n.$$

For $n - k \ge 2$, the number $\frac{p^{n-k}}{n-k+1}$ is, clearly, a $p$-integer. Induction on $n$
shows that the numbers $pB_2, \dots, pB_n$ are $p$-integers. (The start of the induction:
$pB_0 = p$, $pB_1 = -\frac{1}{2}p$.)
For $n - k \ge 2$, the number $\frac{p^{n-k}}{n-k+1}$ is not only a $p$-integer but also its
numerator after all possible simplifications is divisible by $p$, i.e.,

$$\frac{p^{n-k}}{n-k+1} \equiv 0 \pmod p.$$

Therefore

$$pB_n \equiv \sum_{r=1}^{p-1} r^n \pmod p.$$

In the case considered, $n$ is divisible by $p - 1$, and hence $r^n \equiv 1 \pmod p$ for
$r = 1, 2, \dots, p-1$. As a result, we obtain

$$pB_n \equiv -1 \pmod p.$$
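Both theorems can be tested on small cases with exact fractions. The sketch below is only a finite sanity check under stated assumptions: `congruent_to_zero` is an illustrative helper implementing the convention "$p$-integer with numerator divisible by $p$", and the Kummer congruence is checked for even $n$ only.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from sum_{r<n} C(n, r) B_r = 0, n >= 2."""
    B = [Fraction(1)]
    for n in range(2, count + 1):
        B.append(-sum(comb(n, r) * B[r] for r in range(n - 1)) / n)
    return B

def congruent_to_zero(r, p):
    """True if the reduced fraction r has numerator divisible by p and denominator prime to p."""
    r = Fraction(r)
    return r.denominator % p != 0 and r.numerator % p == 0

B = bernoulli_numbers(32)
primes = [2, 3, 5, 7, 11, 13]
for n in range(2, 18, 2):
    for p in primes:
        if n % (p - 1) == 0:
            # von Staudt: p*B_n is congruent to -1 modulo p
            assert congruent_to_zero(p * B[n] + 1, p)
        else:
            # B_n is a p-integer, and Kummer's congruence holds
            assert B[n].denominator % p != 0
            assert congruent_to_zero(B[n + p - 1] / (n + p - 1) - B[n] / n, p)
```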


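We now define $\tilde{B}_n(t) = B_n(t) - B_n(0)$, and we recall that

$$\tilde{B}_n(m) = n\left(1^{n-1} + 2^{n-1} + \dots + (m-1)^{n-1}\right)$$

for all positive integers $m$.

Theorem 3.5.6 (Almkvist-Meurman). For all positive integers $h$, $k$, and
$n$, the number $k^n\tilde{B}_n\left(\frac{h}{k}\right)$ is an integer.

Proof. [Su2] The theorem on the addition of arguments

$$B_n(x + y) = \sum_{s=0}^{n} \binom{n}{s}B_s(x)\,y^{n-s}$$

can be expressed in the form

$$\tilde{B}_n(x + y) = \sum_{s=0}^{n} \binom{n}{s}\tilde{B}_s(x)\,y^{n-s} + \tilde{B}_n(y).$$

Therefore it suffices to prove the statement required for $h = 1$.

For brevity, set

$$a_n = k^n\tilde{B}_n\left(\frac{1}{k}\right).$$

Clearly, $B_0(z) = B_0$. Therefore

$$\sum_{n=1}^{\infty} \tilde{B}_n(z)\frac{t^n}{n!} = \frac{te^{tz}}{e^t - 1} - \frac{t}{e^t - 1} = \frac{t(e^{tz} - 1)}{e^t - 1}.$$

Set $z = \frac{1}{k}$ and make the change $t = kx$. This gives

$$\sum_{n=1}^{\infty} \frac{a_nx^n}{n!} = \frac{kx(e^x - 1)}{e^{kx} - 1} = \frac{kx}{1 + e^x + \dots + e^{(k-1)x}},$$

and

$$e^{kx} - 1 = \sum_{r=1}^{\infty} \frac{k^rx^r}{r!} \quad\text{and}\quad e^x - 1 = \sum_{s=1}^{\infty} \frac{x^s}{s!}.$$

Clearing the denominator in the first form and comparing the coefficients of $x^{n+1}$
on the two sides, we obtain for all $n \ge 1$

$$\sum_{l=1}^{n-1} \binom{n+1}{l}a_lk^{n-l} = (n + 1)(1 - a_n). \qquad (2)$$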
In the second form we get $a_1 = 1$ and for $n \ge 2$ we get

$$\sum_{l=1}^{n-1} \binom{n}{l}a_ls_{n-l} = -ka_n, \qquad (3)$$

where $s_0 = k$ and $s_m = 1^m + 2^m + \dots + (k-1)^m$.
We now prove by induction that the $a_n$ are integers, using (2) and (3). To do
this, we need the following Lemma.

Lemma. Let $p$ be a prime, $2 \le l \le r$ and $(p, s) = 1$. Then $\binom{sp^r}{l}$ is
divisible by $p^{r-l+1}$.
Proof. Let us write $l = tp^{\alpha}$, where $(t, p) = 1$. Clearly, $\alpha \le l - 1$ (the
equality is only possible for $l = p = 2$). It is easy to verify that $\binom{n}{m} = \frac{n}{m}\binom{n-1}{m-1}$. Therefore

$$\binom{sp^r}{l} = \binom{sp^r}{tp^{\alpha}} = \frac{sp^r}{tp^{\alpha}}\binom{sp^r - 1}{tp^{\alpha} - 1} = \frac{s}{t}\,p^{r-\alpha}N,$$

where $N$ is an integer and $s$ and $t$ are relatively prime to $p$. Hence $\binom{sp^r}{l}$ is
divisible by $p^{r-\alpha}$. In turn, $p^{r-\alpha}$ is divisible by $p^{r-l+1}$ because $\alpha \le l - 1$.

Since we have not specified $k$ for some time, recall that

$$a_n = k^n\tilde{B}_n\left(\frac{1}{k}\right).$$

First consider the case where $k$ is a prime. Suppose that $a_1, \dots, a_{n-1}$ are
integers. Then (2) and (3) imply that $(n + 1)a_n$ and $ka_n$ are integers. If
$n + 1$ is not divisible by $k$, then $a_n$ is an integer. Now let $n + 1 = sk^r$,
where $r \ge 1$ and $(k, s) = 1$. To establish that $a_n$ is an integer, it suffices to
prove that $(n + 1)a_n = sk^ra_n$ is divisible by $k^r$. Formula (2) shows that, in
turn, it suffices to prove that the numbers $\binom{n+1}{l}k^{n-l}$ are divisible by $k^r$ for
$l = 1, 2, \dots, sk^r - 2$. For $l \le n - r = sk^r - 1 - r$, this is obvious.

If $sk^r - r \le l \le sk^r - 2$, consider the number $l' = sk^r - l$ and apply the
Lemma to it. As a result, we see that $\binom{sk^r}{l} = \binom{sk^r}{l'}$ is divisible by $k^{r-l'+1}$
and hence $\binom{n+1}{l}k^{n-l}$ is divisible by $k^{(r-l'+1)+(n-l)} = k^r$ as was required.

The case $k = p_1^{\alpha_1}\cdots p_m^{\alpha_m}$, where $p_1, \dots, p_m$ are distinct primes, does
not involve any essentially new ideas. Suppose that $a_1, \dots, a_{n-1}$ are integers.
Then (2) and (3) imply that $(n + 1)a_n$ and $ka_n$ are integers. Let us express
$n + 1$ in the form

$$n + 1 = p_1^{b_1}\cdots p_m^{b_m}s, \quad\text{where } (s, p_i) = 1.$$

If $b_i \ge 1$, the same arguments as in the preceding case show that $(n + 1)a_n$ is
divisible by $p_i^{b_i}$. Set

$$s_i = \frac{n + 1}{p_i^{b_i}}.$$

Then $(s_i, p_i) = 1$ and all the numbers $s_ia_n$ are integers. This means that the
denominator of the rational number $a_n$ is not divisible by the primes $p_1, \dots, p_m$.
On the other hand, the number $ka_n$ is an integer, and hence the denominator of
$a_n$ can only contain the prime factors of $k$.
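The Almkvist-Meurman theorem lends itself to a brute-force check on small parameters. The following is a minimal sketch with illustrative names; it evaluates $k^n\left(B_n(h/k) - B_n(0)\right)$ exactly and confirms that the result has denominator 1 in the tested range.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from sum_{r<n} C(n, r) B_r = 0, n >= 2."""
    B = [Fraction(1)]
    for n in range(2, count + 1):
        B.append(-sum(comb(n, r) * B[r] for r in range(n - 1)) / n)
    return B

B = bernoulli_numbers(12)

def bernoulli_value(n, z):
    """B_n(z) = sum_k C(n, k) B_{n-k} z^k for a rational z."""
    return sum(comb(n, k) * B[n - k] * z**k for k in range(n + 1))

def almkvist_meurman_number(h, k, n):
    """k^n * (B_n(h/k) - B_n(0)), which the theorem asserts is an integer."""
    return k**n * (bernoulli_value(n, Fraction(h, k)) - B[n])

for h in range(1, 7):
    for k in range(1, 7):
        for n in range(1, 10):
            assert almkvist_meurman_number(h, k, n).denominator == 1
```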


3.6 Problems to Chapter 3 


3.6.1 Symmetric polynomials 


3.1 Let o1,...,0 be elementary symmetric polynomials and let the numbers 
@1,-.--,@n € C satisfy the system of equations 
o4(Q1,---,4n) = On(ax,---,@%) fork =1,...,n 
Then a, = +--+: = Gn. 
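In problems 3.2 – 3.4 we assume that $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$,
where all the numbers $x_1, \dots, x_n, y_1, \dots, y_n$ are positive. The
sum $x + y$ is defined component-wise.

3.2 [Ma5] For $r = 2, \dots, n$, the elementary symmetric polynomials $\sigma_i$ satisfy
the inequality

$$\frac{\sigma_r(x + y)}{\sigma_{r-1}(x + y)} \ge \frac{\sigma_r(x)}{\sigma_{r-1}(x)} + \frac{\sigma_r(y)}{\sigma_{r-1}(y)}.$$

3.3 [Ma5] For $r = 1, 2, \dots, n$, we have

$$\sqrt[r]{\sigma_r(x + y)} \ge \sqrt[r]{\sigma_r(x)} + \sqrt[r]{\sigma_r(y)}.$$

3.4 [Wh] Fix $k$ and define the functions $T_r(x)$ by the relation

$$\sum_{r=0}^{\infty} T_r(x)t^r = \begin{cases} \prod_{i=1}^{n} (1 + x_it)^k & \text{for } k > 0, \\ \prod_{i=1}^{n} (1 - x_it)^k & \text{for } k < 0. \end{cases}$$

In particular, if $k = 1$, then $T_r(x) = \sigma_r(x)$ is the elementary symmetric
polynomial and if $k = -1$, then $T_r(x) = p_r(x)$ is a complete homogeneous
polynomial.

a) Prove that if $k > 0$, then $\sqrt[r]{T_r(x + y)} \ge \sqrt[r]{T_r(x)} + \sqrt[r]{T_r(y)}$.
b) Prove that if $k < 0$, then $\sqrt[r]{T_r(x + y)} \le \sqrt[r]{T_r(x)} + \sqrt[r]{T_r(y)}$.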
3.6.2 Integer-valued polynomials
3.5 If a polynomial $f(x)$ of degree $n$ takes integer values at $x = 0, 1, 4, 9, \dots, n^2$,
then it takes integer values at all $x = m^2$, where $m \in \mathbb{N}$.
3.6 Let $m$ and $n$ be positive integers. Prove that the following conditions
are equivalent.
a) There exist integers $a_0, \dots, a_n$ such that

$$\mathrm{GCD}(a_0, \dots, a_n, m) = 1$$

and the values of the polynomial $a_nx^n + a_{n-1}x^{n-1} + \dots + a_0$ at all $x \in \mathbb{Z}$ are
divisible by $m$.
b) $\frac{n!}{m} \in \mathbb{Z}$.


3.6.3 Chebyshev polynomials 


3.7 For $n \ge 2$, the discriminant of the Chebyshev polynomial $T_n$ is equal to
$2^{(n-1)^2}n^n$.

3.8 a) If $n \ge 3$ is odd, the Chebyshev polynomial $T_n$ is reducible.
b) If $n \ne 2^k$, $T_n$ is reducible.
c) If $n \ge 3$ is odd, the polynomial $T_n(x)/x$ is irreducible if and only if $n$ is
a prime.

3.9 Let $u(x)$ and $v(x)$ be polynomials with real coefficients and let $\sqrt{1 - u^2} = v\sqrt{1 - x^2}$.
Prove that

a) $u'(x) = \pm nv(x)$, where $n = \deg u$;
b) $u(x) = \pm T_n(x)$.

3.10 a) Prove that the function $y = T_n(x)$ satisfies the differential equation

$$(1 - x^2)y'' - xy' + n^2y = 0$$

and any polynomial solution of this differential equation is of the form $cT_n$,
where $c$ is a constant.
b) Prove that the function $y = T_n(x)$ satisfies the differential equation

$$(1 - x^2)(y')^2 = n^2(1 - y^2),$$

and this equation has only two polynomial solutions, namely, $y = \pm T_n(x)$.

3.11 Let $\Delta_n(x)$ be the determinant of the $n \times n$ matrix with the diagonal
elements $(x, \dots, x)$, and elements $(1, \frac{1}{2}, \dots, \frac{1}{2})$ in the first super-diagonal, and
elements $(\frac{1}{2}, \dots, \frac{1}{2})$ in the first sub-diagonal, the other elements being zero,
i.e., $a_{ij} = 0$ for $|i - j| > 1$. Prove that $T_n(x) = 2^{n-1}\Delta_n(x)$.

3.12 [Da1] Recall that the matrix $\|a_{ij}\|$ is called a circulant if $a_{ij} = b_{i-j}$,
where $b_k = b_l$ for $k \equiv l \pmod n$. Let $\Delta_n(x)$ be the determinant of the
circulant matrix with $b_0 = 1$, $b_1 = -2x$, $b_2 = 1$. Prove that

$$\Delta_n(x) = 2(1 - T_n(x)).$$
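3.7 Solution of selected problems


3.2. For $r = 2$, the desired inequality follows from the identity

$$\frac{\sigma_2(x + y)}{\sigma_1(x + y)} - \frac{\sigma_2(x)}{\sigma_1(x)} - \frac{\sigma_2(y)}{\sigma_1(y)} = \frac{\sum_{i=1}^{n} \left(x_i\sum_{j=1}^{n} y_j - y_i\sum_{j=1}^{n} x_j\right)^2}{2\sigma_1(x)\sigma_1(y)\sigma_1(x + y)}.$$

Now suppose that $r > 2$ and the desired inequality is already proved for
$r - 1$. Consider the system of numbers $\hat{x}_i = (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$. It is
easy to verify that

$$\sum_{i=1}^{n} x_i\sigma_{r-1}(\hat{x}_i) = r\sigma_r(x), \qquad (1)$$

$$x_i\sigma_{r-1}(\hat{x}_i) + \sigma_r(\hat{x}_i) = \sigma_r(x). \qquad (2)$$

The sum of equations (2) for $i = 1, \dots, n$ gives

$$\sum_{i=1}^{n} x_i\sigma_{r-1}(\hat{x}_i) + \sum_{i=1}^{n} \sigma_r(\hat{x}_i) = n\sigma_r(x).$$

Subtracting equation (1) from this equation we obtain

$$\sum_{i=1}^{n} \sigma_r(\hat{x}_i) = (n - r)\sigma_r(x). \qquad (3)$$

It is also clear that

$$\sigma_r(x) - \sigma_r(\hat{x}_i) = x_i\sigma_{r-1}(\hat{x}_i) = x_i\sigma_{r-1}(x) - x_i^2\sigma_{r-2}(\hat{x}_i).$$

The sum of these equalities for $i = 1, \dots, n$, with (3) taken into account, gives

$$r\sigma_r(x) = \sum_{i=1}^{n} x_i\sigma_{r-1}(x) - \sum_{i=1}^{n} x_i^2\sigma_{r-2}(\hat{x}_i),$$

i.e.,

$$\frac{r\sigma_r(x)}{\sigma_{r-1}(x)} = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \frac{x_i^2}{x_i + \sigma_{r-1}(\hat{x}_i)/\sigma_{r-2}(\hat{x}_i)}.$$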
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Let us write similar identities for y and «+ y. The inequality required follows 
from the fact that, if x;, y;, a;, bj, cg are positive integers such that c; > a; +, 
then 


a} ee Ui (witty)? Sai i, Ye (wit yi? 
Zita, yita wityita ~ rita yitar xwtyitaitd; 
(ajax; —biyi)? Sh) 

(xitai)(yitbi)(tityitaitbi) ~~ 


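3.3. We use Problem 3.2 and the following auxiliary statement.

Lemma. If $a_1, \dots, a_r, b_1, \dots, b_r$ are non-negative numbers then

$$\sqrt[r]{(a_1 + b_1)\cdot\ldots\cdot(a_r + b_r)} \ge \sqrt[r]{a_1\cdot\ldots\cdot a_r} + \sqrt[r]{b_1\cdot\ldots\cdot b_r}.$$

Proof. Let

$$R(z) = \{z = (z_1, \dots, z_r) \in \mathbb{R}^r \mid z_1\cdot\ldots\cdot z_r = 1,\ z_i > 0\}.$$

The inequality between the arithmetic and geometric means gives

$$\sqrt[r]{a_1\cdot\ldots\cdot a_r} = \min_{z\in R(z)} \frac{a_1z_1 + \dots + a_rz_r}{r}.$$

Therefore

$$\sqrt[r]{(a_1 + b_1)\cdot\ldots\cdot(a_r + b_r)} = \min_{z\in R(z)} \frac{(a_1 + b_1)z_1 + \dots + (a_r + b_r)z_r}{r} \ge \min_{z\in R(z)} \frac{a_1z_1 + \dots + a_rz_r}{r} + \min_{z\in R(z)} \frac{b_1z_1 + \dots + b_rz_r}{r} = \sqrt[r]{a_1\cdot\ldots\cdot a_r} + \sqrt[r]{b_1\cdot\ldots\cdot b_r}.$$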


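Let us express $\sigma_r(x + y)$ as the product

$$\sigma_r(x + y) = \frac{\sigma_r(x + y)}{\sigma_{r-1}(x + y)}\cdot\frac{\sigma_{r-1}(x + y)}{\sigma_{r-2}(x + y)}\cdot\ldots\cdot\frac{\sigma_1(x + y)}{1}.$$

By Problem 3.2

$$\frac{\sigma_k(x + y)}{\sigma_{k-1}(x + y)} \ge \frac{\sigma_k(x)}{\sigma_{k-1}(x)} + \frac{\sigma_k(y)}{\sigma_{k-1}(y)}.$$

Now use the Lemma with $a_k = \sigma_k(x)/\sigma_{k-1}(x)$ and $b_k = \sigma_k(y)/\sigma_{k-1}(y)$ to obtain

$$\sqrt[r]{\sigma_r(x + y)} \ge \sqrt[r]{(a_1 + b_1)\cdot\ldots\cdot(a_r + b_r)} \ge \sqrt[r]{\sigma_r(x)} + \sqrt[r]{\sigma_r(y)}.$$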
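3.4. We only consider the case $k < 0$ involving complete homogeneous polynomials. Let $l = -k > 0$. In this case we have an integral representation for the $\Gamma$-function:

$$\Gamma(l) = \int_0^\infty e^{-t}t^{l-1}\,dt.$$

For $a > 0$, we can make the change $t = as$ and get

$$\Gamma(l) = a^l\int_0^\infty e^{-as}s^{l-1}\,ds,$$

i.e.,

$$a^{-l} = \frac{1}{\Gamma(l)}\int_0^\infty e^{-as}s^{l-1}\,ds.$$

Set $a_i = 1 - x_it$. For small values of $|t|$, the number $a_i$ is positive, and therefore

$$\prod_{i=1}^n (1 - x_it)^{-l} = \Bigl(\frac{1}{\Gamma(l)}\Bigr)^n \int_0^\infty\!\!\cdots\int_0^\infty f(s_1, \dots, s_n)\,ds_1\cdots ds_n,$$

where

$$f(s_1, \dots, s_n) = e^{-(s_1 + \dots + s_n)}\,e^{t(x_1s_1 + \dots + x_ns_n)}\,(s_1\cdot\ldots\cdot s_n)^{l-1}.$$

Since

$$e^{t(x_1s_1 + \dots + x_ns_n)} = \sum_{r=0}^\infty \frac{t^r(x_1s_1 + \dots + x_ns_n)^r}{r!},$$

we obtain that the coefficient of $t^r$ in the expansion of $\prod_{i=1}^n (1 - x_it)^{-l}$ is equal to

$$\frac{1}{r!\,\Gamma(l)^n}\int_0^\infty\!\!\cdots\int_0^\infty (x_1s_1 + \dots + x_ns_n)^r\,\varphi(s_1, \dots, s_n)\,ds_1\cdots ds_n,$$

where $\varphi(s_1, \dots, s_n) = e^{-s_1 - \dots - s_n}(s_1\cdot\ldots\cdot s_n)^{l-1}$. The required inequality now follows from the Minkowski inequality

$$\Bigl(\int_a^b (g(x) + h(x))^r\,dx\Bigr)^{1/r} \le \Bigl(\int_a^b g(x)^r\,dx\Bigr)^{1/r} + \Bigl(\int_a^b h(x)^r\,dx\Bigr)^{1/r},$$

where $g$ and $h$ are non-negative on $[a, b]$.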

3.11. Simple calculations show that $T_n(x) = 2^{n-1}A_n(x)$ for $n = 1$ and $n = 2$. For $n \ge 2$, expanding the determinant $A_{n+1}(x)$ with respect to the last row, we obtain

$$A_{n+1}(x) = xA_n(x) - \frac14 A_{n-1}(x).$$

This relation corresponds to the recurrence formula

$$T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x).$$
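The identity $T_n(x) = 2^{n-1}A_n(x)$ can be checked at rational points. The following is only an illustrative sketch: the helper names `chebyshev_T`, `tridiagonal_matrix` and `det` are hypothetical, and the matrix layout (first super-diagonal entry 1, all other off-diagonal entries 1/2) is the one assumed in the statement of Problem 3.11.

```python
from fractions import Fraction

def chebyshev_T(n, x):
    """T_n(x) from the recurrence T_{k+1} = 2x T_k - T_{k-1}, T_0 = 1, T_1 = x."""
    t_prev, t_cur = Fraction(1), Fraction(x)
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2 * x * t_cur - t_prev
    return t_cur

def tridiagonal_matrix(n, x):
    """The n x n matrix of Problem 3.11 (assumed layout) at the point x."""
    half = Fraction(1, 2)
    a = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = Fraction(x)
        if i + 1 < n:
            a[i][i + 1] = Fraction(1) if i == 0 else half  # super-diagonal
            a[i + 1][i] = half                             # sub-diagonal
    return a

def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = Fraction(0)
    for j, entry in enumerate(m[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point approximation.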


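3.12. Let $f(t) = c_0 + c_1t + \dots + c_{n-1}t^{n-1}$ and let $\varepsilon_1, \dots, \varepsilon_n$ be distinct $n$th roots of unity. The determinant of the circulant matrix with elements $a_{ij} = c_{i-j}$ is equal to $f(\varepsilon_1)\cdot f(\varepsilon_2)\cdot\ldots\cdot f(\varepsilon_n)$. Indeed, e.g., for $n = 3$ (with the roots of unity denoted by $1, \varepsilon_1, \varepsilon_2$), we obtain

$$\begin{vmatrix} 1 & 1 & 1 \\ 1 & \varepsilon_1 & \varepsilon_1^2 \\ 1 & \varepsilon_2 & \varepsilon_2^2 \end{vmatrix} \cdot \begin{vmatrix} c_0 & c_2 & c_1 \\ c_1 & c_0 & c_2 \\ c_2 & c_1 & c_0 \end{vmatrix} = \begin{vmatrix} f(1) & f(1) & f(1) \\ f(\varepsilon_1) & \varepsilon_1f(\varepsilon_1) & \varepsilon_1^2f(\varepsilon_1) \\ f(\varepsilon_2) & \varepsilon_2f(\varepsilon_2) & \varepsilon_2^2f(\varepsilon_2) \end{vmatrix} = f(1)\cdot f(\varepsilon_1)\cdot f(\varepsilon_2) \begin{vmatrix} 1 & 1 & 1 \\ 1 & \varepsilon_1 & \varepsilon_1^2 \\ 1 & \varepsilon_2 & \varepsilon_2^2 \end{vmatrix}.$$

Since the determinant of the matrix $\begin{pmatrix} 1 & 1 & 1 \\ 1 & \varepsilon_1 & \varepsilon_1^2 \\ 1 & \varepsilon_2 & \varepsilon_2^2 \end{pmatrix}$ does not vanish, we can divide by it. As a result we obtain the identity required. For $n > 3$, the arguments are similar.

In our case, $f(t) = 1 - 2xt + t^2$, and therefore

$$A_n(x) = \prod_{k=1}^n (1 - 2x\varepsilon_k + \varepsilon_k^2).$$

In other words, for $x = \cos\varphi$, we have to prove that

$$2(1 - \cos n\varphi) = \prod_{k=1}^n (1 - 2\varepsilon_k\cos\varphi + \varepsilon_k^2).$$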

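First, we prove that

$$2(1 - \cos n\varphi) = 2^n\prod_{k=1}^n \Bigl(1 - \cos\Bigl(\varphi + \frac{2k\pi}{n}\Bigr)\Bigr).$$

This identity follows from the fact that

$$x^{2n} - 2x^n\cos n\varphi + 1 = (x^n - \exp(in\varphi))(x^n - \exp(-in\varphi)) = \prod_{k=1}^n \Bigl(x - \exp\Bigl(i\varphi + \frac{2k\pi i}{n}\Bigr)\Bigr)\prod_{k=1}^n \Bigl(x - \exp\Bigl(-i\varphi - \frac{2k\pi i}{n}\Bigr)\Bigr) = \prod_{k=1}^n \Bigl(x^2 - 2x\cos\Bigl(\varphi + \frac{2k\pi}{n}\Bigr) + 1\Bigr).$$

Then, for $x = 1$, we get the identity required.
Let us prove now that

$$2^n\prod_{k=1}^n \Bigl(1 - \cos\Bigl(\varphi + \frac{2k\pi}{n}\Bigr)\Bigr) = \prod_{k=1}^n (1 - 2\cos\varphi\,\varepsilon_k + \varepsilon_k^2),$$

where $\varepsilon_k = \exp\bigl(\frac{2k\pi i}{n}\bigr)$. Clearly,

$$(x - \varepsilon_k)(x - \varepsilon_{-k}) = x^2 - 2x\cos\Bigl(\frac{2k\pi}{n}\Bigr) + 1.$$

Hence $\varepsilon_k^2 + 1 = 2\varepsilon_k\cos\bigl(\frac{2k\pi}{n}\bigr)$, and therefore

$$\prod_{k=1}^n (1 - 2\cos\varphi\,\varepsilon_k + \varepsilon_k^2) = \prod_{k=1}^n 2\varepsilon_k\Bigl(\cos\Bigl(\frac{2k\pi}{n}\Bigr) - \cos\varphi\Bigr) = -2^{2n}\prod_{k=1}^n \sin\Bigl(\frac{k\pi}{n} + \frac{\varphi}{2}\Bigr)\sin\Bigl(\frac{k\pi}{n} - \frac{\varphi}{2}\Bigr);$$

here we have used the relations $\varepsilon_1\cdot\ldots\cdot\varepsilon_n = (-1)^{n+1}$ and $\cos\alpha - \cos\beta = -2\sin\frac{\alpha + \beta}{2}\sin\frac{\alpha - \beta}{2}$.

It remains to observe that the last expression is equal to

$$2^{2n}\prod_{k=1}^n \sin^2\Bigl(\frac{\varphi}{2} + \frac{k\pi}{n}\Bigr) = 2^n\prod_{k=1}^n \Bigl(1 - \cos\Bigl(\varphi + \frac{2k\pi}{n}\Bigr)\Bigr),$$

since

$$\sin\Bigl(\frac{k\pi}{n} - \frac{\varphi}{2}\Bigr) = \sin\Bigl(\frac{\varphi}{2} + \frac{(n - k)\pi}{n}\Bigr) \quad\text{for } k = 1, \dots, n - 1$$

and

$$\sin\Bigl(\pi - \frac{\varphi}{2}\Bigr) = -\sin\Bigl(\frac{\varphi}{2} + \pi\Bigr).$$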
Certain Properties of Polynomials


4.1 Polynomials with prescribed values 


4.1.1 Lagrange’s interpolation polynomial 


Let $x_1, \dots, x_{n+1}$ be distinct points in the complex plane $\mathbb{C}$. Then there exists precisely one polynomial $P(x)$ of degree not greater than $n$ which takes a prescribed value $a_i$ at $x_i$. Indeed, the uniqueness of $P$ follows from the fact that the difference of two such polynomials vanishes at the points $x_1, \dots, x_{n+1}$ and at the same time has degree not greater than $n$. The following polynomial clearly possesses all the necessary properties:

$$P(x) = \sum_{k=1}^{n+1} a_k\frac{(x - x_1)\cdot\ldots\cdot(x - x_{k-1})(x - x_{k+1})\cdot\ldots\cdot(x - x_{n+1})}{(x_k - x_1)\cdot\ldots\cdot(x_k - x_{k-1})(x_k - x_{k+1})\cdot\ldots\cdot(x_k - x_{n+1})} = \sum_{k=1}^{n+1} a_k\frac{\omega(x)}{(x - x_k)\,\omega'(x_k)},$$

where

$$\omega(x) = (x - x_1)\cdot\ldots\cdot(x - x_{n+1}).$$

The polynomial $P(x)$ is called Lagrange’s interpolation polynomial and the points $x_1, \dots, x_{n+1}$ are called the interpolation nodes.
If $a_k = f(x_k)$, where $f$ is a given function, then $P$ is called Lagrange’s interpolation polynomial for $f$.
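The explicit formula above translates directly into code. The following minimal sketch evaluates Lagrange’s interpolation polynomial at a point using exact rational arithmetic; the function name `lagrange_eval` is an illustrative choice, not notation from the text.

```python
from fractions import Fraction

def lagrange_eval(nodes, values, x):
    """Value at x of the polynomial of degree <= n taking values[k] at nodes[k].

    nodes must be pairwise distinct; all arithmetic is done with Fractions.
    """
    total = Fraction(0)
    for k, (xk, ak) in enumerate(zip(nodes, values)):
        term = Fraction(ak)
        for j, xj in enumerate(nodes):
            if j != k:
                # factor (x - x_j) / (x_k - x_j) of the k-th basis polynomial
                term *= Fraction(x - xj) / Fraction(xk - xj)
        total += term
    return total
```

If the data come from a polynomial of degree at most $n$, the interpolant reproduces that polynomial exactly, which gives a convenient way to test the routine.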


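Theorem 4.1.1. Let $f \in C^{n+1}[a, b]$ and let $P$ be Lagrange’s interpolation polynomial for $f$ with nodes $x_1, \dots, x_{n+1} \in [a, b]$. Then

$$\max_{a\le x\le b} |f(x) - P(x)| \le \frac{M}{(n + 1)!}\max_{a\le x\le b} |\omega(x)|,$$

where $M = \max\limits_{a\le x\le b} |f^{(n+1)}(x)|$.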
Proof. It suffices to verify that, for any point $x_0 \in [a, b]$, there exists a point $\xi \in [a, b]$ such that

$$f(x_0) - P(x_0) = \frac{f^{(n+1)}(\xi)}{(n + 1)!}\,\omega(x_0).$$

For $x_0 = x_i$, where $1 \le i \le n + 1$, this equality is obvious. We therefore assume that $x_0 \ne x_i$. Consider the function

$$u(x) = f(x) - P(x) - \lambda\omega(x),$$

where $\lambda$ is a constant. Since $\omega(x_0) \ne 0$, this constant can be selected so that $u(x_0) = 0$. It is also clear that $u(x_1) = \dots = u(x_{n+1}) = 0$. The function $u(x)$ has at least $n + 2$ zeros on the interval $[a, b]$. Hence $u'(x)$ has at least $n + 1$ zeros on this interval and $u^{(k)}(x)$ has at least $n + 2 - k$ zeros. For $k = n + 1$, we see that

$$u^{(n+1)}(x) = f^{(n+1)}(x) - (n + 1)!\,\lambda$$

vanishes at a point $\xi \in [a, b]$. This means that $\lambda = \dfrac{f^{(n+1)}(\xi)}{(n + 1)!}$, i.e.,

$$f(x_0) - P(x_0) = \frac{f^{(n+1)}(\xi)}{(n + 1)!}\,\omega(x_0).$$


For a fixed interval $[a, b]$ and a fixed degree $n$, the estimate given by Theorem 4.1.1 is optimal if $\omega(x)$ is the monic polynomial of degree $n + 1$ with the least deviation from zero on $[a, b]$. Since $\omega$ is determined by its roots, this is a condition on the interpolation nodes.

For example, if $[a, b] = [-1, 1]$, then we should have $\omega(x) = \frac{1}{2^n}T_{n+1}(x)$, where $T_{n+1}(x)$ is a Chebyshev polynomial. Recall that

$$T_{n+1}(x) = \cos((n + 1)\arccos x) \quad\text{for } |x| \le 1.$$

The roots of $T_{n+1}$ are

$$x_k = \cos\frac{(2k - 1)\pi}{2(n + 1)}, \quad k = 1, \dots, n + 1.$$
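To illustrate why the Chebyshev nodes are preferable, the sketch below compares $\max|\omega(x)|$ on $[-1, 1]$ for Chebyshev nodes and for equally spaced nodes. The maximum is estimated by sampling on a uniform grid, which is an assumption of this illustration (it is not an exact maximization), and the function names are hypothetical.

```python
import math

def chebyshev_nodes(n):
    """The n + 1 roots of T_{n+1}: cos((2k - 1) pi / (2 (n + 1))), k = 1..n+1."""
    return [math.cos((2 * k - 1) * math.pi / (2 * (n + 1))) for k in range(1, n + 2)]

def equispaced_nodes(n):
    """n + 1 equally spaced nodes on [-1, 1], endpoints included."""
    return [-1 + 2 * k / n for k in range(n + 1)]

def max_abs_omega(nodes, samples=2001):
    """Sampled estimate of max |omega(x)| on [-1, 1], omega(x) = prod (x - x_k)."""
    best = 0.0
    for s in range(samples):
        x = -1 + 2 * s / (samples - 1)
        best = max(best, abs(math.prod(x - xk for xk in nodes)))
    return best
```

For the Chebyshev nodes the estimate should be close to $2^{-n}$, the value attained by $|T_{n+1}(x)|/2^n$ at $x = \pm 1$; for equally spaced nodes it is noticeably larger already for moderate $n$.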


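For such nodes, the interpolation polynomial is

$$P(x) = \frac{1}{n + 1}\sum_{k=1}^{n+1} a_k(-1)^{k-1}\sqrt{1 - x_k^2}\;\frac{T_{n+1}(x)}{x - x_k}.$$

Indeed,

$$\frac{\omega(x)}{(x - x_k)\,\omega'(x_k)} = \frac{T_{n+1}(x)}{(x - x_k)\,T'_{n+1}(x_k)},$$

and therefore we have to prove that

$$T'_{n+1}(x_k) = \frac{(n + 1)(-1)^{k-1}}{\sqrt{1 - x_k^2}}.$$

Since $T_{n+1}(x) = \cos(n + 1)\varphi$, where $x = \cos\varphi$, we obtain

$$T'_{n+1}(x) = \frac{(n + 1)\sin(n + 1)\varphi}{\sin\varphi}.$$

If $\cos\varphi = x_k$, then $\sin\varphi = \sqrt{1 - x_k^2}$ and $\sin(n + 1)\varphi = (-1)^{k-1}$.
In addition to Chebyshev interpolation nodes, one also makes use of the nodes uniformly distributed on the unit circle. For the nodes

$$x_k = \exp\Bigl(\frac{2\pi ik}{n + 1}\Bigr), \quad\text{where } k = 1, \dots, n + 1,$$

the interpolation polynomial is

$$P(x) = \frac{1}{n + 1}\sum_{k=1}^{n+1} a_kx_k\frac{x^{n+1} - 1}{x - x_k}.$$

To prove this formula, it suffices to observe that

$$\frac{d}{dx}(x^{n+1} - 1)\Big|_{x = x_k} = (n + 1)x_k^n = (n + 1)x_k^{-1}.$$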


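The interpolation polynomial for the nodes $x_k = a + (k - 1)h$, where $k = 1, \dots, n + 1$, can be expressed in the form

$$P(x) = f(a) + \frac{\Delta f(a)}{h}\cdot\frac{x - a}{1!} + \frac{\Delta^2f(a)}{h^2}\cdot\frac{(x - a)(x - a - h)}{2!} + \dots + \frac{\Delta^nf(a)}{h^n}\cdot\frac{(x - a)(x - a - h)\cdot\ldots\cdot(x - a - (n - 1)h)}{n!},$$

where

$$\Delta f(x) = f(x + h) - f(x), \quad \Delta^{k+1}f(x) = \Delta(\Delta^kf(x)).$$

This polynomial $P(x)$ is called Newton’s interpolation polynomial. It is easy to verify that $P(x_k) = f(x_k)$. Indeed,

$$P(a) = f(a),$$
$$P(a + h) = f(a) + \Delta f(a),$$
$$P(a + 2h) = f(a) + 2\Delta f(a) + \Delta^2f(a),$$
$$\dots$$
$$P(a + mh) = \sum_{j=0}^m \binom{m}{j}\Delta^jf(a) = f(a + mh).$$

The last identity follows from the fact that $\Delta^kf(x + h) = \Delta^{k+1}f(x) + \Delta^kf(x)$.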
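Newton’s form is convenient computationally because the finite differences $\Delta^kf(a)$ are obtained from a simple difference table. The sketch below (function names are illustrative) builds the table and evaluates $P(x)$ term by term.

```python
from fractions import Fraction

def forward_differences(samples):
    """Return [f(a), Δf(a), Δ²f(a), ...] from the samples f(a), f(a+h), f(a+2h), ..."""
    diffs, row = [], list(samples)
    while row:
        diffs.append(row[0])
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
    return diffs

def newton_eval(a, h, samples, x):
    """Evaluate Newton's interpolation polynomial for equally spaced nodes at x."""
    total, factor = Fraction(0), Fraction(1)
    for k, d in enumerate(forward_differences(samples)):
        total += d * factor
        # extend (x-a)...(x-a-(k-1)h) / (h^k k!) by the next factor
        factor *= Fraction(x - a - k * h) / (h * (k + 1))
    return total
```

As a usage example, with `samples = [0, 0, 6, 24]` (the values of $x^3 - x$ at $0, 1, 2, 3$), `a = 0` and `h = 1`, the routine reproduces $x^3 - x$ at any rational point.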
4.1.2 Hermite’s interpolation polynomial


Let $x_1, \dots, x_n$ be distinct points of the complex plane $\mathbb{C}$ and $\alpha_1, \dots, \alpha_n$ positive integers whose sum is equal to $m + 1$. For each point $x_i$, let $y_i^{(0)}, y_i^{(1)}, \dots, y_i^{(\alpha_i - 1)}$ be given numbers. Then there is a unique polynomial $H_m(x)$ of degree not greater than $m$ such that

$$H_m(x_i) = y_i^{(0)}, \quad H'_m(x_i) = y_i^{(1)}, \quad \dots, \quad H_m^{(\alpha_i - 1)}(x_i) = y_i^{(\alpha_i - 1)}$$

for $i = 1, \dots, n$. In other words, at $x_i$ the polynomial $H_m$ has prescribed values for its derivatives up to order $\alpha_i - 1$ inclusive. This polynomial $H_m$ is called Hermite’s interpolation polynomial.

The uniqueness of $H_m(x)$ is quite obvious. Indeed, if $G(x)$ is the difference of two Hermite interpolation polynomials, then $\deg G \le m$ and $G(x)$ is divisible by

$$\Omega(x) = (x - x_1)^{\alpha_1}\cdot\ldots\cdot(x - x_n)^{\alpha_n},$$

a polynomial of degree $m + 1$; hence $G = 0$.

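To construct Hermite’s interpolation polynomial, we have only to define polynomials $\varphi_{ik}(x)$ ($i = 1, \dots, n$ and $k = 0, 1, \dots, \alpha_i - 1$) such that

1) $\deg\varphi_{ik} \le m$;

2) $\varphi_{ik}(x)$ is divisible by the polynomial $\dfrac{\Omega(x)}{(x - x_i)^{\alpha_i}}$, i.e., $\varphi_{ik}(x)$ is divisible by $(x - x_j)^{\alpha_j}$ for $j \ne i$;

3) the expansion of $\varphi_{ik}(x)$ as a power series in $x - x_i$ begins with

$$\frac{1}{k!}(x - x_i)^k + (x - x_i)^{\alpha_i}(\cdots).$$

We then have

$$\varphi_{ik}(x_j) = \varphi'_{ik}(x_j) = \dots = \varphi_{ik}^{(\alpha_j - 1)}(x_j) = 0 \quad\text{for } j \ne i,$$
$$\varphi_{ik}^{(k)}(x_i) = 1$$

and

$$\varphi_{ik}^{(l)}(x_i) = 0 \quad\text{for } 0 \le l \le \alpha_i - 1,\ l \ne k.$$

Thus we can define

$$H_m(x) = \sum_{i=1}^n \sum_{k=0}^{\alpha_i - 1} y_i^{(k)}\varphi_{ik}(x).$$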
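To define the $\varphi_{ik}$, we note that the function $\dfrac{1}{k!}\cdot\dfrac{(x - x_i)^{\alpha_i}}{\Omega(x)}$ is regular at $x_i$, and therefore in a neighborhood of $x_i$ it can be expanded into the Taylor series

$$\frac{1}{k!}\cdot\frac{(x - x_i)^{\alpha_i}}{\Omega(x)} = \sum_{s=0}^\infty a_{iks}(x - x_i)^s = l_{ik}(x) + \sum_{s=\alpha_i - k}^\infty a_{iks}(x - x_i)^s.$$

Here $l_{ik}(x)$ is a polynomial of degree not greater than $\alpha_i - k - 1$ which is the initial part of the Taylor series. It is not difficult to verify that the polynomial

$$\varphi_{ik}(x) = l_{ik}(x)\frac{\Omega(x)}{(x - x_i)^{\alpha_i - k}}$$

possesses all the properties required. Properties 1) and 2) are obvious, and property 3) is proved as follows:

$$\varphi_{ik}(x) = \frac{l_{ik}(x)(x - x_i)^k}{k!\,\bigl(l_{ik}(x) + a(x - x_i)^{\alpha_i - k} + \cdots\bigr)} = \frac{1}{k!}(x - x_i)^k\bigl(1 + b(x - x_i)^{\alpha_i - k} + \cdots\bigr).$$

Looking at the explicit form of the initial part of the Taylor series $l_{ik}(x)$, we obtain

$$H_m(x) = \sum_{i=1}^n \sum_{k=0}^{\alpha_i - 1} \sum_{s=0}^{\alpha_i - k - 1} y_i^{(k)}\frac{1}{k!}\cdot\frac{1}{s!}\left[\frac{(x - x_i)^{\alpha_i}}{\Omega(x)}\right]^{(s)}_{x = x_i}\frac{\Omega(x)}{(x - x_i)^{\alpha_i - k - s}}.$$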


4.1.3 The polynomial with prescribed values at the zeros of its 
derivative 


In 1956, Andrushkiw [An3] announced the following statement:

for any $n$ complex numbers $a_1, \dots, a_n$, there exists a monic polynomial $P$ of degree $n + 1$ which takes the values $a_1, \dots, a_n$ at the zeros of its derivative $P'$.

The first published proof of this statement was given, however, only 9 years later by René Thom [Th]. We give the proof due to Jan Mycielski [My].


Theorem 4.1.2. For any given numbers $a_1, \dots, a_n \in \mathbb{C}$, there exist numbers $b_1, \dots, b_n \in \mathbb{C}$ and a polynomial $P(x) = x^{n+1} + p_1x^n + \dots + p_nx$ such that $P(b_i) = a_i$ and $P'(b_i) = 0$ for $i = 1, \dots, n$. Moreover, if a number $\beta$ occurs $k$ times in the sequence $b_1, \dots, b_n$, then $P(x) - P(\beta)$ is divisible by $(x - \beta)^{k+1}$.


Proof. For $b = (b_1, \dots, b_n) \in \mathbb{C}^n$, we define

$$P_b(x) = (n + 1)\int_0^x \Bigl(\prod_{i=1}^n (t - b_i)\Bigr)dt.$$

Clearly, $P_b(0) = 0$ and $P_b(x)$ is a monic polynomial of degree $n + 1$. Further,

$$P'_b(x) = (n + 1)(x - b_1)\cdot\ldots\cdot(x - b_n),$$

and hence $P'_b(b_i) = 0$. If $\beta$ occurs exactly $k$ times in the sequence $b_1, \dots, b_n$, then $P'_b(\beta) = \dots = P_b^{(k)}(\beta) = 0$. Observe that any number $\beta$ is a root of the polynomial $P_b(x) - P_b(\beta)$ and $(P_b(x) - P_b(\beta))' = P'_b(x)$. Hence $P_b(x) - P_b(\beta)$ is divisible by $(x - \beta)^{k+1}$.
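It remains to prove that the map

$$\varphi : \mathbb{C}^n \to \mathbb{C}^n, \quad \varphi(b) = (P_b(b_1), \dots, P_b(b_n))$$

is surjective. First we prove that $\varphi$ is a local homeomorphism at any point $b = (b_1, \dots, b_n)$ such that $b_i \ne b_j$ for $i \ne j$ and $b_1\cdot\ldots\cdot b_n \ne 0$. For this, it suffices to verify that $\det\Bigl(\dfrac{\partial P_b(b_i)}{\partial b_j}\Bigr) \ne 0$. Suppose on the contrary that $\det\Bigl(\dfrac{\partial P_b(b_i)}{\partial b_j}\Bigr) = 0$. Then there are numbers $c_1, \dots, c_n$, not all zero, such that

$$\sum_{j=1}^n c_j\frac{\partial P_b(b_i)}{\partial b_j} = 0 \quad\text{for } i = 1, \dots, n. \tag{1}$$

It is easy to verify that

$$\frac{\partial P_b(b_i)}{\partial b_j} = -(n + 1)\int_0^{b_i} \prod_{s\ne j}(t - b_s)\,dt$$

(for $i = j$, there appears an extra summand $(n + 1)\prod\limits_{s=1}^n (b_i - b_s)$, but it vanishes).

Hence (1) can be expressed in the form $F(b_i) = 0$ for $i = 1, \dots, n$, where

$$F(x) = \int_0^x \sum_{j=1}^n c_j\prod_{s\ne j}(t - b_s)\,dt.$$

The integrand is a polynomial in $t$ of degree not greater than $n - 1$ which takes the value $c_j\prod\limits_{s\ne j}(b_j - b_s)$ at $t = b_j$. By hypothesis, $\prod\limits_{s\ne j}(b_j - b_s) \ne 0$ and $c_j \ne 0$ for some $j$. Hence $F(x)$ is a nonzero polynomial of degree not greater than $n$. On the other hand, $F(x) = 0$ at the $n + 1$ distinct points $x = 0, b_1, \dots, b_n$: a contradiction.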

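The map $\varphi : \mathbb{C}^n \to \mathbb{C}^n$ induces a map $\tilde\varphi : \mathbb{CP}^n \to \mathbb{CP}^n$ given by the formula¹

$$\tilde\varphi((b_0 : b_1 : \dots : b_n)) = (b_0^{n+1} : P_b(b_1) : \dots : P_b(b_n)).$$

The point is that

$$\varphi(\lambda b) = \lambda^{n+1}\varphi(b) \quad\text{and}\quad \varphi^{-1}(0) = 0.$$

The first of these properties is proved by the change of variable $\tau = \lambda t$:

$$\int_0^{b_j} \prod_{i=1}^n (t - b_i)\,dt = \int_0^{\lambda b_j} \prod_{i=1}^n \Bigl(\frac{\tau}{\lambda} - b_i\Bigr)\lambda^{-1}\,d\tau = \lambda^{-n-1}\int_0^{\lambda b_j} \prod_{i=1}^n (\tau - \lambda b_i)\,d\tau.$$

¹ The notation $(b_0 : b_1 : \dots : b_n)$, standard in projective geometry, means the $(n + 1)$-tuple $(b_0, b_1, \dots, b_n)$ defined up to a nonzero factor.

The second property is proved as follows. Let $\varphi(b_1, \dots, b_n) = (0, \dots, 0)$. Suppose that the sequence $b_1, \dots, b_n$ consists of $k_1$ numbers $\beta_1$, $k_2$ numbers $\beta_2$, and so on, $k_m$ numbers $\beta_m$ (here $\beta_i \ne \beta_j$ for $i \ne j$). Then the polynomial $P_b(x) = P_b(x) - P_b(\beta_i)$ is of degree $n + 1$, where $n = k_1 + \dots + k_m$, and is divisible by $x$ and by $(x - \beta_1)^{k_1 + 1}\cdot\ldots\cdot(x - \beta_m)^{k_m + 1}$. This is only possible if $b_1 = \dots = b_n = 0$.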

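Let $A$ be the set of points $(b_0 : b_1 : \dots : b_n) \in \mathbb{CP}^n$ whose coordinates satisfy one of the following equations:

$$b_i = 0 \quad (0 \le i \le n), \qquad b_i = b_j \quad (1 \le i < j \le n).$$

The restriction of $\tilde\varphi$ to $\mathbb{CP}^n \setminus A$ is a local homeomorphism.
Moreover, $\tilde\varphi(A) \subset A$. Indeed, if $b_0 = 0$, then

$$\tilde\varphi((b_0 : b_1 : \dots : b_n)) = (0 : P_b(b_1) : \dots : P_b(b_n)).$$

If $b_i = 0$ for some $i \ge 1$, then $P_b(b_i) = 0$, and if $b_i = b_j$ for $1 \le i < j \le n$, then $P_b(b_i) = P_b(b_j)$.

The image of $\mathbb{CP}^n$ under $\tilde\varphi$ is a compact set; in particular, it is closed. The image of $\mathbb{CP}^n \setminus A$ under $\tilde\varphi$ is an open set whose boundary belongs to $\tilde\varphi(A) \subset A$. The set $A$ does not divide $\mathbb{CP}^n$ since it is of real codimension 2. Hence, $\tilde\varphi(\mathbb{CP}^n \setminus A) \supset \mathbb{CP}^n \setminus A$ and the closure of $\tilde\varphi(\mathbb{CP}^n \setminus A)$ coincides with $\mathbb{CP}^n$.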


Remark. One should not think that $\tilde\varphi(\mathbb{CP}^n \setminus A) \subset \mathbb{CP}^n \setminus A$. This is false, as the example $\varphi(1, 2, 3) = (-9, -8, -9)$ shows.
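The example in the Remark can be reproduced mechanically. The sketch below constructs the coefficients of $P_b$ by integrating $(n + 1)\prod (t - b_i)$ term by term and then evaluates the map $\varphi$; the function names are illustrative only.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            out[i + j] += a * c
    return out

def P_b_coeffs(b):
    """Coefficients (lowest degree first) of P_b(x) = (n+1) * int_0^x prod (t - b_i) dt."""
    n = len(b)
    integrand = [Fraction(1)]
    for bi in b:
        integrand = poly_mul(integrand, [Fraction(-bi), Fraction(1)])
    # integrate term by term from 0 and multiply by n + 1
    return [Fraction(0)] + [(n + 1) * c / (k + 1) for k, c in enumerate(integrand)]

def poly_eval(p, x):
    """Horner evaluation of a lowest-degree-first coefficient list."""
    value = Fraction(0)
    for c in reversed(p):
        value = value * x + c
    return value

def phi(b):
    """The map b -> (P_b(b_1), ..., P_b(b_n))."""
    coeffs = P_b_coeffs(b)
    return tuple(poly_eval(coeffs, bi) for bi in b)
```

For $b = (1, 2, 3)$ one gets $P_b(x) = x^4 - 8x^3 + 22x^2 - 24x$, and the first and third coordinates of $\varphi(b)$ coincide although the $b_i$ are distinct.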


4.2 The height of a polynomial and other norms 


4.2.1 Gauss’s lemma 


Let $K$ be a field, and let $x \mapsto |x|_v \in \mathbb{R}$ be a function on $K$. This function is called an absolute value if it satisfies the following conditions:
(1) $|x|_v \ge 0$ and $|x|_v = 0 \iff x = 0$;
(2) $|xy|_v = |x|_v|y|_v$;
(3) $|x + y|_v \le |x|_v + |y|_v$.
If instead of (3) a stronger condition
(3′) $|x + y|_v \le \max\{|x|_v, |y|_v\}$
holds, then this absolute value is said to be non-Archimedean.
For the field of rationals $\mathbb{Q}$, there is a natural absolute value, namely $|x|_\infty = |x|$ (the conventional absolute value of $x$). This absolute value is Archimedean.
But there are also so-called $p$-adic absolute values defined for every prime $p$ as follows. Let us represent a nonzero rational $x$ in the form $x = p^r\frac{m}{n}$, where $m$ and $n$ are integers not divisible by $p$. Set

$$|x|_p = \frac{1}{p^r}.$$

It is easy to verify that this absolute value is non-Archimedean. Indeed, let

$$x = p^r\frac{m_1}{n_1} \quad\text{and}\quad y = p^s\frac{m_2}{n_2}$$

be such that $r \le s$. Then $\max\{|x|_p, |y|_p\} = \dfrac{1}{p^r}$ and

$$x + y = p^r\,\frac{m_1n_2 + p^{s-r}m_2n_1}{n_1n_2}.$$

Hence $|x + y|_p \le \dfrac{1}{p^r}$ (the strict inequality is only possible for $s = r$).
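As an illustration of the definition, the following sketch computes $|x|_p$ for a rational $x$ exactly. The function name `padic_abs` and the convention $|0|_p = 0$ are choices made for this example.

```python
from fractions import Fraction

def padic_abs(x, p):
    """p-adic absolute value of a rational x: p**(-r), where x = p**r * m / n
    with m and n not divisible by p; |0|_p is taken to be 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    r = 0
    num, den = x.numerator, x.denominator
    while num % p == 0:      # powers of p in the numerator raise r
        num //= p
        r += 1
    while den % p == 0:      # powers of p in the denominator lower r
        den //= p
        r -= 1
    return Fraction(1, p) ** r
```

For instance, $|18/5|_3 = 1/9$ and $|7/25|_5 = 25$; the case $x = 1$, $y = 2$, $p = 3$ shows that the inequality (3′) can be strict when the two absolute values coincide.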
The height of the polynomial $f(x) = \sum a_ix^i$ relative to a given absolute value $|\cdot|_v$ is the number $H(f) = \max\limits_i |a_i|_v$. We will be interested in estimates of the heights of the product of polynomials in terms of the heights of the factors. The simplest estimate is obtained if the absolute value is non-Archimedean.


Lemma. Let $H(f)$ be the height of the polynomial $f$ with respect to a non-Archimedean absolute value $|\cdot|$. Then $H(fg) = H(f)H(g)$.

Proof. Let $f(x) = a_nx^n + \dots + a_0$ and $g(x) = b_mx^m + \dots + b_0$. Among the coefficients $a_n, \dots, a_0$ consider those with the maximal absolute value (there can be several such coefficients) and select among them the coefficient $a_r$ with the greatest index $r$. Similarly select the coefficient $b_s$ of maximal absolute value with the greatest index $s$.

Clearly,

$$(fg)(x) = f(x)g(x) = c_{n+m}x^{n+m} + \dots + c_0, \quad\text{where } c_k = \sum_{i+j=k} a_ib_j.$$

Since $|\cdot|$ is a non-Archimedean absolute value, we deduce that

$$|c_k| \le \max_{i+j=k}\{|a_i|\cdot|b_j|\}.$$

Hence,

$$|c_k| < |a_r|\cdot|b_s| \quad\text{if } k > r + s,$$
$$c_{r+s} = a_rb_s(1 + \alpha), \quad\text{where } |\alpha| < 1,$$
$$|c_k| \le |a_r|\cdot|b_s| \quad\text{if } k < r + s.$$

For non-Archimedean absolute values, $|\alpha| < 1$ implies that $|1 + \alpha| = 1$. To see this, we note that

$$|1 + \alpha| \le \max\{1, |\alpha|\} = 1 \quad\text{and}\quad 1 = |1 + \alpha - \alpha| \le \max\{|1 + \alpha|, |\alpha|\},$$

and hence, $1 \le |1 + \alpha|$.
Therefore $|c_{r+s}| = |a_r|\cdot|b_s|$ and the absolute values of the remaining coefficients $c_k$ do not exceed $|a_r|\cdot|b_s|$. Hence $H(fg) = |c_{r+s}| = |a_r|\cdot|b_s| = H(f)H(g)$.
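The multiplicativity of the height can be observed on small integer polynomials. The sketch below is illustrative only: it repeats a hypothetical helper `padic_abs` for the $p$-adic absolute value so that the block is self-contained, and represents polynomials as coefficient lists with the constant term first.

```python
from fractions import Fraction

def padic_abs(x, p):
    """p-adic absolute value of a rational x (with |0|_p = 0)."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    r, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        r += 1
    while den % p == 0:
        den //= p
        r -= 1
    return Fraction(1, p) ** r

def poly_mul(f, g):
    """Product of coefficient lists (constant term first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def padic_height(f, p):
    """H(f) = max |a_i|_p over the coefficients of f."""
    return max(padic_abs(a, p) for a in f)
```

With $f = 6 + 12x + 18x^2$ and $g = 4 + 10x$, for example, the 2-adic heights are $1/2$ and $1/2$, and the 2-adic height of $fg$ is $1/4$.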


It goes without saying that Gauss formulated and proved his lemma using 
simpler language. That is, he proved that 


the greatest common divisor of the coefficients of the product of the poly- 
nomials f and g is equal to the product of the greatest common divisor of the 
coefficients of f and the greatest common divisor of the coefficients of g. 


We can arrive at Gauss’s formulation as follows. For a $p$-adic absolute value, $H(f) = p^{-r}$, where $p^r$ is the greatest power of $p$ which divides all the coefficients of $f$. The identity $H(fg) = H(f)H(g)$ means that, if $p$ enters the greatest common divisors of the coefficients of $f$ and $g$ with powers $r$ and $s$, respectively, then it enters the greatest common divisor of the coefficients of $fg$ with power $r + s$.

For the polynomial $f(x_1, \dots, x_n) = \sum a_{i_1\dots i_n}x_1^{i_1}\cdot\ldots\cdot x_n^{i_n}$ in $n$ variables, the height is similarly defined as

$$H(f) = \max |a_{i_1\dots i_n}|.$$

To prove Gauss’s lemma for polynomials in $n$ variables, we make use of the so-called Kronecker substitution. Let $d = \deg f + \deg g + 1$. To the polynomial $h(x_1, \dots, x_n) = \sum c_{k_1\dots k_n}x_1^{k_1}\cdot\ldots\cdot x_n^{k_n}$, we assign the polynomial

$$(S_dh)(y) = h(y, y^d, \dots, y^{d^{n-1}}) = \sum c_{k_1\dots k_n}y^{k_1 + k_2d + \dots + k_nd^{n-1}}.$$

If $\deg h < d$, then the nonzero coefficients of $h$ and $S_dh$ are the same, and hence $H(h) = H(S_dh)$. Moreover, $S_d(fg) = S_d(f)S_d(g)$. Hence, Gauss’s lemma for polynomials in $n$ variables follows from Gauss’s lemma for one variable.
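The Kronecker substitution is easy to experiment with if polynomials are stored as dictionaries. In the sketch below (an illustrative representation, not taken from the text) a multivariate polynomial maps exponent tuples $(k_1, \dots, k_n)$ to coefficients, and a univariate one maps exponents to coefficients.

```python
def kronecker(h, d):
    """S_d h: send the monomial x_1^{k_1}...x_n^{k_n} to y^{k_1 + k_2 d + ... + k_n d^{n-1}}."""
    out = {}
    for exps, c in h.items():
        e = sum(k * d ** i for i, k in enumerate(exps))
        out[e] = out.get(e, 0) + c
    return out

def multiply_multivariate(f, g):
    """Product of two polynomials keyed by exponent tuples of equal length."""
    out = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            e = tuple(a + b for a, b in zip(ef, eg))
            out[e] = out.get(e, 0) + cf * cg
    return out

def multiply_univariate(f, g):
    """Product of two polynomials keyed by integer exponents."""
    out = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            out[ef + eg] = out.get(ef + eg, 0) + cf * cg
    return out
```

Because $d$ exceeds the degree of $fg$, distinct monomials of $fg$ are sent to distinct powers of $y$, so no coefficients are merged by the substitution.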


4.2.2 Polynomials in one variable 
For Archimedean absolute values, the height

$$H(f) = \max |a_i|$$

does not possess the multiplicativity property $H(fg) = H(f)H(g)$. But it is exactly for the conventional absolute value that the estimate of the height of polynomials is most interesting. Such estimates are needed for the theory of transcendental numbers. The estimate of the height of polynomials was obtained by A. O. Gelfond [Gel] as a by-product of the solution of Hilbert’s seventh problem:

“If $a \ne 0, 1$ is an algebraic number and $b$ is an irrational algebraic number, then $a^b$ is a transcendental number.”
Later, K. Mahler [Ma2], [Ma3] found a simplified proof of Gelfond’s estimates.

To estimate the height of the polynomial $f(x) = a_d(x - \alpha_1)\cdot\ldots\cdot(x - \alpha_d)$ Mahler used the quantity

$$M(f) = |a_d|\prod_{i=1}^d \max\{1, |\alpha_i|\},$$

which is now called Mahler’s measure of $f$. Clearly, Mahler’s measure is multiplicative:

$$M(fg) = M(f)M(g).$$

Therefore the upper and lower bounds for $M(f)$ in terms of $H(f)$ enable us to estimate $H(fg)$ in terms of $H(f)$ and $H(g)$.


Theorem 4.2.1. Let $\deg f = d$. Then

$$\frac{1}{\sqrt{d + 1}}\,M(f) \le H(f) \le 2^{d-1}M(f).$$

Proof. Let us start with a simpler inequality, $H(f) \le 2^{d-1}M(f)$. Clearly,

$$|a_d|\cdot|\alpha_{i_1}\cdot\ldots\cdot\alpha_{i_k}| \le |a_d|\prod_{i=1}^d \max\{1, |\alpha_i|\} = M(f).$$

Hence

$$|a_k| \le \binom{d}{k}M(f). \tag{1}$$

Starting from the formula $\binom{d}{k} = \binom{d-1}{k-1} + \binom{d-1}{k}$ it is easy to prove by induction on $d$ that $\binom{d}{k} \le 2^{d-1}$ for $d \ge 1$. Together with (1) this proves the inequality required.

The proof of the inequality $M(f) \le \sqrt{d + 1}\,H(f)$ is based on Jensen’s formula in the next lemma.

Lemma. If a function $f(z)$ is holomorphic in the disk $|z| \le 1$ and has zeros $z_1, \dots, z_n$ (multiplicities counted) inside this disk, then

$$\frac{1}{2\pi}\int_0^{2\pi} \ln|f(e^{i\varphi})|\,d\varphi = \ln|f(0)| - \sum_{k=1}^n \ln|z_k|.$$

Proof. Consider the auxiliary function

$$f_1(z) = \frac{f(z)}{w_1(z)\cdot\ldots\cdot w_n(z)},$$

where $w_k(z) = \dfrac{z - z_k}{1 - \bar z_kz}$. It is easy to verify that $w_k$ conformally maps the unit disk into itself. Indeed, if $|z| = 1$, then $|w_k(z)|^2 = 1$, and, further, $w_k(z_k) = 0$.

The function $f_1$ has no zeros inside the unit disk since the zeros $z_1, \dots, z_n$ of the numerator $f(z)$ cancel the zeros of the denominator $w_1(z)\cdot\ldots\cdot w_n(z)$. Therefore, by the mean value theorem for the harmonic function $\ln|f_1(z)| = \operatorname{Re}(\ln f_1(z))$, we obtain

$$\frac{1}{2\pi}\int_0^{2\pi} \ln|f_1(e^{i\varphi})|\,d\varphi = \ln|f_1(0)|.$$

But $|f_1(e^{i\varphi})| = |f(e^{i\varphi})|$ and $\ln|f_1(0)| = \ln|f(0)| - \sum\limits_{k=1}^n \ln|z_k|$.


Corollary. Let $f$ be a polynomial. Then

$$M(f) = \exp\int_0^1 \ln|f(e^{2\pi it})|\,dt. \tag{2}$$

Proof. Both sides of (2) are multiplicative with respect to $f$. Therefore it suffices to consider the case $f(x) = x - \alpha$. For $|\alpha| \ge 1$, the function has no zeros inside the unit circle and, for $|\alpha| < 1$, it has only the one zero $\alpha$ inside the unit circle. Therefore, by Jensen’s formula,

$$\int_0^1 \ln|f(e^{2\pi it})|\,dt = \frac{1}{2\pi}\int_0^{2\pi} \ln|f(e^{i\varphi})|\,d\varphi = \ln|f(0)| - \varepsilon\ln|\alpha| = (1 - \varepsilon)\ln|\alpha|,$$

where $\varepsilon = 0$ for $|\alpha| \ge 1$ and $\varepsilon = 1$ for $|\alpha| < 1$.
On the other hand, $M(f) = \max\{1, |\alpha|\} = |\alpha|^{1-\varepsilon}$.


Armed with the formula (2), we can tackle the proof that $M(f) \le \sqrt{d + 1}\,H(f)$. Clearly,

$$\int_0^1 \Bigl|\sum a_ke^{2\pi ikt}\Bigr|^2 dt = \int_0^1 \sum_{k,l} a_k\bar a_le^{2\pi i(k - l)t}\,dt = \sum |a_k|^2.$$

Also, the convexity of the function $\exp$ implies that

$$\exp\int_0^1 u(t)\,dt \le \int_0^1 \exp u(t)\,dt$$

for any function $u(t)$. Taking $u(t) = 2\ln|f(e^{2\pi it})|$, we obtain

$$M(f) \le \Bigl(\int_0^1 |f(e^{2\pi it})|^2\,dt\Bigr)^{1/2} = \Bigl(\sum_k |a_k|^2\Bigr)^{1/2} \le \sqrt{d + 1}\,\max_k |a_k| = \sqrt{d + 1}\,H(f).$$
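Both descriptions of Mahler’s measure, and the two-sided bound of Theorem 4.2.1, can be checked numerically on an example. The sketch below is an illustration under stated assumptions: the polynomial is specified by a leading coefficient and a list of roots (none on the unit circle, so that $\ln|f|$ has no singularity there), and the integral in (2) is approximated by an average over equally spaced points of the circle. All function names are hypothetical.

```python
import cmath
import math

def poly_from_roots(lead, roots):
    """Coefficients (highest degree first) of lead * prod (x - r)."""
    coeffs = [complex(lead)]
    for r in roots:
        shifted = coeffs + [0j]          # multiply the current polynomial by x
        for i, c in enumerate(coeffs):
            shifted[i + 1] -= r * c      # subtract r times the current polynomial
        coeffs = shifted
    return coeffs

def mahler_from_roots(lead, roots):
    """M(f) = |a_d| * prod max(1, |alpha_i|)."""
    return abs(lead) * math.prod(max(1.0, abs(r)) for r in roots)

def mahler_from_integral(coeffs, samples=4096):
    """exp of the average of ln|f| over equally spaced points of the unit circle."""
    total = 0.0
    for s in range(samples):
        z = cmath.exp(2j * math.pi * s / samples)
        value = 0j
        for c in coeffs:                 # Horner's rule, highest degree first
            value = value * z + c
        total += math.log(abs(value))
    return math.exp(total / samples)

def height(coeffs):
    """H(f) = max |a_k|."""
    return max(abs(c) for c in coeffs)
```

For $f(x) = 2(x - 3)(x - \tfrac12)(x - 2i)$ the product formula gives $M(f) = 2\cdot 3\cdot 1\cdot 2 = 12$, and the sampled integral should agree with this value closely.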


Using Theorem 4.2.1 we can obtain the following estimates for $H(fg)$ in terms of $H(f)$ and $H(g)$.

Theorem 4.2.2. Let $d_1 = \deg f$ and $d_2 = \deg g$ be such that $d_1 \le d_2$. Then

$$\frac{H(f)H(g)}{2^{d_1 + d_2 - 2}\sqrt{d_1 + d_2 + 1}} \le H(fg) \le (1 + d_1)H(f)H(g).$$

Proof. We prove that $H(fg) \le (1 + d_1)H(f)H(g)$ directly, without appealing to Jensen’s formula. To do this, let

$$f(x) = \sum a_ix^i, \quad g(x) = \sum b_jx^j, \quad fg(x) = \sum c_kx^k.$$

Then

$$|c_k| = |a_0b_k + a_1b_{k-1} + \dots + a_{d_1}b_{k-d_1}| \le (1 + d_1)\max|a_i|\max|b_j| = (1 + d_1)H(f)H(g).$$

To prove that

$$H(f)H(g) \le 2^{d_1 + d_2 - 2}\sqrt{d_1 + d_2 + 1}\,H(fg),$$

we use Theorem 4.2.1. This gives

$$H(f) \le 2^{d_1 - 1}M(f), \quad H(g) \le 2^{d_2 - 1}M(g) \quad\text{and}\quad \frac{1}{\sqrt{d_1 + d_2 + 1}}\,M(fg) \le H(fg).$$

It remains to observe that $M(f)M(g) = M(fg)$.


With Mahler’s measure $M(f)$ we can also estimate the length $L(f)$ of a polynomial $f$, defined by

$$L(f) = \sum_{k=0}^d |a_k|.$$

Indeed, on the one hand, the inequality (1) from the proof of Theorem 4.2.1 gives

$$L(f) = \sum_{k=0}^d |a_k| \le M(f)\sum_{k=0}^d \binom{d}{k} = 2^dM(f). \tag{3}$$

On the other hand,

$$|f(e^{2\pi it})| = \Bigl|\sum a_ke^{2\pi ikt}\Bigr| \le \sum |a_k| = L(f).$$

Therefore

$$M(f) \le \exp\int_0^1 \ln L(f)\,dt = L(f). \tag{4}$$

By combining (3) and (4) we obtain, for $d_1 = \deg f$ and $d_2 = \deg g$,

$$L(f)L(g) \le 2^{d_1}M(f)\,2^{d_2}M(g) = 2^{d_1 + d_2}M(fg) \le 2^{d_1 + d_2}L(fg).$$

The upper estimate for $L(fg)$ is of the form

$$L(fg) \le L(f)L(g).$$

This follows immediately from the definition of the length of the polynomial:

$$L(fg) \le \sum_{i,j} |a_i|\cdot|b_j| = \Bigl(\sum |a_i|\Bigr)\Bigl(\sum |b_j|\Bigr) = L(f)L(g).$$
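The two-sided estimate $L(fg) \le L(f)L(g) \le 2^{d_1 + d_2}L(fg)$ involves only integer arithmetic for integer polynomials, so it can be checked exactly. A minimal sketch, with illustrative function names and coefficient lists ordered from the constant term up:

```python
def poly_mul(f, g):
    """Product of coefficient lists (constant term first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def length(f):
    """L(f) = sum of the absolute values of the coefficients."""
    return sum(abs(a) for a in f)

def length_bounds_hold(f, g):
    """Check L(fg) <= L(f) L(g) <= 2^(deg f + deg g) L(fg)."""
    d1, d2 = len(f) - 1, len(g) - 1
    fg = poly_mul(f, g)
    return length(fg) <= length(f) * length(g) <= 2 ** (d1 + d2) * length(fg)
```

For $f = 1 - 3x + 2x^2$ and $g = 2 - x^2 + 5x^3$ one finds $L(f)L(g) = 48$ and $L(fg) = 46$, so cancellation between terms makes the upper estimate slightly lossy here.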


4.2.3 The maximum of the absolute value and S. Bernstein’s inequality

Initially, to estimate the height of a polynomial, Gelfond made use of $\max\limits_{|z|=1} |f(z)|$. He proved the following statement.

Theorem 4.2.3. Let $d_1 = \deg f$, $d_2 = \deg g$ and $d = d_1 + d_2$. Then

$$\max_{|z|=1} |f(z)|\cdot\max_{|z|=1} |g(z)| \le 2^{2d}\max_{|z|=1} |fg(z)|.$$


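Proof. Without loss of generality we may assume that

$$\max_{|z|=1} |f(z)| = \max_{|z|=1} |g(z)| = 1.$$

Suppose that $\max\limits_{|z|=1} |fg(z)| < \dfrac{1}{2^{2d}}$. Then, for $k = 0, 1, \dots, d$, one of the numbers $|f(\varepsilon_k)|$ and $|g(\varepsilon_k)|$, where $\varepsilon_k = \exp\Bigl(\dfrac{2\pi ik}{d + 1}\Bigr)$, is less than $\dfrac{1}{2^d}$. Therefore either $|f(\varepsilon_k)| < \dfrac{1}{2^d}$ for $d_1 + 1$ values of the index $k$ or $|g(\varepsilon_k)| < \dfrac{1}{2^d}$ for $d_2 + 1$ values of the index $k$. To be definite, let

$$\{\varepsilon_0, \dots, \varepsilon_d\} = \{\alpha_0, \dots, \alpha_{d_1}, \beta_1, \dots, \beta_{d_2}\}$$

and $|f(\alpha_l)| < \dfrac{1}{2^d}$ for $l = 0, 1, \dots, d_1$.
By Lagrange’s interpolation formula, we have

$$f(z) = \sum_{l=0}^{d_1} f(\alpha_l)\frac{(z - \alpha_0)\cdot\ldots\cdot(z - \alpha_{l-1})(z - \alpha_{l+1})\cdot\ldots\cdot(z - \alpha_{d_1})}{(\alpha_l - \alpha_0)\cdot\ldots\cdot(\alpha_l - \alpha_{l-1})(\alpha_l - \alpha_{l+1})\cdot\ldots\cdot(\alpha_l - \alpha_{d_1})}.$$

Let us multiply the numerator and denominator of the $l$-th summand in this formula by $(\alpha_l - \beta_1)\cdot\ldots\cdot(\alpha_l - \beta_{d_2})$. As a result, the denominator becomes equal to

$$\lim_{x\to\alpha_l} \frac{(x - \varepsilon_0)(x - \varepsilon_1)\cdot\ldots\cdot(x - \varepsilon_d)}{x - \alpha_l} = \lim_{x\to\alpha_l} \frac{x^{d+1} - 1}{x - \alpha_l}.$$

Since $\alpha_l$ is a root of $x^{d+1} - 1$, the last limit is equal to the derivative of $x^{d+1} - 1$ at $\alpha_l$, i.e., to $(d + 1)\alpha_l^d$.
If $|z| = 1$, the numerator obtained consists of $d$ factors and the absolute value of each of them does not exceed 2. Hence, if $|z| = 1$, we have

$$|f(z)| < \sum_{l=0}^{d_1} \frac{1}{2^d}\cdot\frac{2^d}{d + 1} = \frac{d_1 + 1}{d + 1} \le 1,$$

which contradicts the hypothesis that $\max\limits_{|z|=1} |f(z)| = 1$.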
zj=1 


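Remark. For polynomials with coefficients in $F = \mathbb{C}$ or in $\mathbb{R}$, sharper estimates can be obtained in the form

$$\prod_{k=1}^m \Bigl(\max_{|z|=1} |f_k(z)|\Bigr) \le C_F(m, n)\max_{|z|=1} |f(z)|,$$

where $f = f_1\cdot\ldots\cdot f_m$, $n = \deg f$ and $C_F(m, n)$ is a constant (see [Bo]) which is defined as follows. Let

$$I(\theta) = \int_0^\theta \ln\Bigl(2\cos\frac{t}{2}\Bigr)dt.$$

Then

$$C_{\mathbb{C}}(m, n) = \Bigl(\exp\Bigl(\frac{m}{\pi}\,I\Bigl(\frac{\pi}{m}\Bigr)\Bigr)\Bigr)^n$$

and $C_{\mathbb{R}}(m, n) = C_{\mathbb{C}}(2, n)$. Both estimates are precise.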


The norm

$$\|f\| = \max_{|z|=1} |f(z)|$$

of the polynomial $f$ satisfies the following inequality.

Theorem 4.2.4 (S. Bernstein). Let $\deg f = n$. Then $\|f'\| \le n\|f\|$.


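Proof. [O] We need the following Lemma.

Lemma. Let $\deg f \le n$ and let $z_1, \dots, z_n$ be the roots of the polynomial $z^n + 1$. Then, for any $t \in \mathbb{C}$, we have

$$tf'(t) = \frac{n}{2}f(t) + \frac{1}{n}\sum_{k=1}^n f(tz_k)\frac{2z_k}{(z_k - 1)^2}.$$

Proof. Set $g_t(z) = \dfrac{f(tz) - f(t)}{z - 1}$. It is easy to verify that $g_t(1) = tf'(t)$ and $g_t$ is a polynomial in $z$ of degree not higher than $n - 1$. Lagrange’s interpolation formula with nodes at $z_1, \dots, z_n$ shows that

$$g_t(z) = \sum_{k=1}^n g_t(z_k)\frac{z^n + 1}{(z - z_k)\,nz_k^{n-1}} = -\frac{1}{n}\sum_{k=1}^n g_t(z_k)z_k\frac{z^n + 1}{z - z_k}.$$

(We used the fact that $z_k^{n-1} = -\dfrac{1}{z_k}$.)

For $z = 1$, we obtain

$$tf'(t) = \frac{1}{n}\sum_{k=1}^n g_t(z_k)\frac{2z_k}{z_k - 1} = \frac{1}{n}\sum_{k=1}^n \frac{2z_k\,(f(tz_k) - f(t))}{(z_k - 1)^2} = \frac{1}{n}\sum_{k=1}^n f(tz_k)\frac{2z_k}{(z_k - 1)^2} - \frac{f(t)}{n}\sum_{k=1}^n \frac{2z_k}{(z_k - 1)^2}.$$

To calculate the sum $\sum\limits_{k=1}^n \dfrac{2z_k}{(z_k - 1)^2}$ we consider the choice $f(t) = t^n$. Then $f(tz_k) = -t^n$, and hence

$$nt^n = -\frac{2t^n}{n}\sum_{k=1}^n \frac{2z_k}{(z_k - 1)^2},$$

i.e.,

$$\frac{1}{n}\sum_{k=1}^n \frac{2z_k}{(z_k - 1)^2} = -\frac{n}{2}. \tag{1}$$

This proves the Lemma.

Let us show that $\dfrac{2z_k}{(z_k - 1)^2}$ is a negative real number. Indeed, $z_k = e^{i\varphi} \ne 1$, and hence

$$\frac{2z_k}{(z_k - 1)^2} = \frac{2e^{i\varphi}}{(e^{i\varphi} - 1)^2} = \frac{2}{e^{i\varphi} - 2 + e^{-i\varphi}} = \frac{1}{\cos\varphi - 1} < 0.$$

Hence (1) implies that

$$\frac{1}{n}\sum_{k=1}^n \Bigl|\frac{2z_k}{(z_k - 1)^2}\Bigr| = \frac{n}{2}.$$

Therefore, for $|t| = 1$, the Lemma gives $|f'(t)| = |tf'(t)| \le \dfrac{n}{2}\|f\| + \dfrac{n}{2}\|f\| = n\|f\|$, and the theorem now follows.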
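The interpolation identity of the Lemma, $tf'(t) = \frac{n}{2}f(t) + \frac{1}{n}\sum_k f(tz_k)\frac{2z_k}{(z_k - 1)^2}$, can be tested numerically for a concrete polynomial. The sketch below assumes coefficient lists ordered from the constant term up; the function names are illustrative.

```python
import cmath
import math

def poly_eval(coeffs, t):
    """Horner evaluation; coeffs are ordered from the constant term up."""
    value = 0j
    for c in reversed(coeffs):
        value = value * t + c
    return value

def poly_deriv_eval(coeffs, t):
    """Value of the derivative at t."""
    deriv = [k * c for k, c in enumerate(coeffs)][1:]
    return poly_eval(deriv, t)

def lemma_rhs(coeffs, n, t):
    """(n/2) f(t) + (1/n) sum_k f(t z_k) 2 z_k / (z_k - 1)^2 over the roots z_k of z^n + 1."""
    total = n / 2 * poly_eval(coeffs, t)
    for k in range(n):
        zk = cmath.exp(1j * math.pi * (2 * k + 1) / n)   # z_k^n = -1
        total += poly_eval(coeffs, t * zk) * 2 * zk / (zk - 1) ** 2 / n
    return total
```

The identity holds for every $n \ge \deg f$, not only for $n = \deg f$, which the test below exercises with two different values of $n$.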


4.2.4 Polynomials in several variables 


Mahler’s measure also helps to estimate the height $H(F) = \max |a_{k_1\dots k_n}|$ of the polynomial $F(x_1, \dots, x_n) = \sum a_{k_1\dots k_n}x_1^{k_1}\cdot\ldots\cdot x_n^{k_n}$ in $n$ variables (see [Ma3]).

Mahler’s measure of a polynomial in one variable can be obtained by either of the two equivalent formulas:

$$M(f) = |a_d|\prod_{i=1}^d \max\{1, |\alpha_i|\},$$

$$M(f) = \exp\int_0^1 \ln|f(e^{2\pi it})|\,dt.$$

For polynomials in $n$ variables, only the second definition is relevant:

$$M(F) = \exp\int_0^1\!\!\cdots\int_0^1 \ln|F(e^{2\pi it_1}, \dots, e^{2\pi it_n})|\,dt_1\cdots dt_n. \tag{$*$}$$


We recall that in the proof of Theorem 4.2.1 we obtained the inequality

$$|a_k| \le \binom{d}{k}M(f)$$

for polynomials in one variable.
Using this inequality we can prove the following inequality for polynomials in $n$ variables:

$$|a_{k_1\dots k_n}| \le \binom{d_1}{k_1}\cdot\ldots\cdot\binom{d_n}{k_n}M(F), \tag{1}$$

where $d_1, \dots, d_n$ are the degrees of the polynomial $F$ with respect to $x_1, \dots, x_n$, respectively. To do this, we express $F$ in the form

$$F(x_1, \dots, x_n) = \sum_{k_1=0}^{d_1} F_{k_1}(x_2, \dots, x_n)\,x_1^{k_1}.$$

For fixed $x_2 = \alpha_2$, ..., $x_n = \alpha_n$, we have the estimate

$$|F_{k_1}(\alpha_2, \dots, \alpha_n)| \le \binom{d_1}{k_1}M(g),$$

where $g(x) = F(x, \alpha_2, \dots, \alpha_n)$.
Set $x = e^{2\pi it_1}$, $\alpha_2 = e^{2\pi it_2}$, ..., $\alpha_n = e^{2\pi it_n}$, and then take the logarithms of the inequality obtained:

$$\ln|F_{k_1}(e^{2\pi it_2}, \dots, e^{2\pi it_n})| \le \ln\binom{d_1}{k_1} + \int_0^1 \ln|F(e^{2\pi it_1}, \dots, e^{2\pi it_n})|\,dt_1.$$

Let us integrate both parts of this inequality over $t_2, \dots, t_n$ from 0 to 1, and then take the exponent. The definition of Mahler’s measure ($*$) directly implies that as a result we obtain $M(F_{k_1}) \le \binom{d_1}{k_1}M(F)$.

Next, we express $F_{k_1}(x_2, \dots, x_n)$ in the form

$$F_{k_1}(x_2, \dots, x_n) = \sum_{k_2=0}^{d_2} F_{k_1k_2}(x_3, \dots, x_n)\,x_2^{k_2}.$$

We similarly prove that

$$M(F_{k_1k_2}) \le \binom{d_2}{k_2}M(F_{k_1}) \le \binom{d_1}{k_1}\binom{d_2}{k_2}M(F),$$

and so on. It is also clear that $M(a_{k_1\dots k_n}) = |a_{k_1\dots k_n}|$.
As we have already mentioned, it is easy to prove by induction on $d$ that $\binom{d}{k} \le 2^{d-1}$ for $d \ge 1$. Hence (1) implies that, if $d_1 > 0, \dots, d_n > 0$, then

$$H(F) \le 2^{d_1 + d_2 + \dots + d_n - n}M(F). \tag{2}$$

If, in reality, the polynomial $F(x_1, \dots, x_n)$ depends only on $\nu(F)$ indeterminates, whereas the remaining $n - \nu(F)$ indeterminates enter with degree 0, then instead of (2) we obtain a rougher estimate

$$H(F) \le 2^{d_1 + d_2 + \dots + d_n - \nu(F)}M(F). \tag{3}$$

The estimate of $H(F)$ from below is proved in exactly the same way as in the proof of Theorem 4.2.1 for polynomials in one variable. This estimate is of the form

$$M(F) \le \sqrt{d_1 + 1}\cdot\ldots\cdot\sqrt{d_n + 1}\,H(F). \tag{4}$$
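Let $F_1, \dots, F_s$ be polynomials in $x_1, \dots, x_n$ and let $d_{l1}, \dots, d_{ln}$ be the degrees of the polynomial $F_l$ with respect to these variables; let $\nu(F_l)$ be the number of variables on which $F_l$ actually depends (i.e., the number of indices $j$ for which $d_{lj} > 0$). Let $F = F_1\cdot\ldots\cdot F_s$, and let $d_j = d_{1j} + \dots + d_{sj}$ be the degree of $F$ with respect to $x_j$. Then by (3)

$$\prod_{l=1}^s H(F_l) \le \prod_{l=1}^s 2^{d_{l1} + d_{l2} + \dots + d_{ln} - \nu(F_l)}M(F_l) \le 2^{d_1 + d_2 + \dots + d_n - \nu(F)}M(F),$$

since every variable on which $F$ depends enters at least one of the polynomials $F_l$.
Now, using (4), we obtain

$$H(F_1)\cdot\ldots\cdot H(F_s) \le 2^{d_1 + d_2 + \dots + d_n - n}\sqrt{d_1 + 1}\cdot\ldots\cdot\sqrt{d_n + 1}\,H(F),$$

where we assume that $\nu(F) = n$, i.e., $F$ actually depends on all the variables $x_1, \dots, x_n$. The upper estimate

$$H(F) \le 2^{d_1 + d_2 + \dots + d_n}H(F_1)\cdot\ldots\cdot H(F_s)$$

can be proved without appealing to Mahler’s measure. Indeed, the number of nonzero coefficients of $F_l$ does not exceed

$$(1 + d_{l1})\cdot\ldots\cdot(1 + d_{ln}) \le 2^{d_{l1} + d_{l2} + \dots + d_{ln}}.$$

Therefore any coefficient of $F$ is the sum of not more than $2^{d_1 + d_2 + \dots + d_n}$ products of the coefficients of the polynomials $F_1, \dots, F_s$.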
Using Mahler’s measure, we can also estimate the length of the polynomial

$$L(F) = \sum |a_{k_1\dots k_n}|.$$

Summing the inequalities (1) we obtain

$$L(F) \le 2^{d_1 + d_2 + \dots + d_n}M(F). \tag{5}$$

For the polynomial $F(x_1, \dots, x_n) = (1 + x_1)^{d_1}\cdot\ldots\cdot(1 + x_n)^{d_n}$, inequality (5) becomes an equality, and so the estimate is precise.
The estimate of $L(F)$ from below is simply obtained: from the obvious inequality

$$|F(e^{2\pi it_1}, \dots, e^{2\pi it_n})| \le L(F)$$

we derive

$$M(F) \le L(F). \tag{6}$$

For the polynomial $F(x_1, \dots, x_n) = x_1^{d_1}\cdot\ldots\cdot x_n^{d_n}$, inequality (6) becomes an equality, and so the estimate is again precise.
From (5) it follows that

$$\prod_{l=1}^s L(F_l) \le \prod_{l=1}^s \bigl(2^{d_{l1} + d_{l2} + \dots + d_{ln}}M(F_l)\bigr) = 2^{d_1 + d_2 + \dots + d_n}M(F).$$

Then by (6) we obtain

$$L(F_1)\cdot\ldots\cdot L(F_s) \le 2^{d_1 + d_2 + \dots + d_n}L(F).$$

The estimate from above

$$L(F) \le L(F_1)\cdot\ldots\cdot L(F_s)$$

is obvious.
4.2.5 An inequality for a pair of relatively prime polynomials


Let $f(x)$ and $g(x)$ be relatively prime polynomials over $\mathbb{C}$. Then

$$m(x) = \max\{|f(x)|, |g(x)|\} > 0$$

and $m(x) \to \infty$ as $x \to \infty$. Therefore the quantity

$$E(f, g) = \min_{x\in\mathbb{C}} m(x)$$


is positive. In the theory of transcendental numbers one sometimes needs to 
estimate E(f,g) from below. N. I. Feldman suggested a method to obtain 
a precise estimate. We give an exposition of his method following Mahler’s 
paper [Mad]. 


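Theorem 4.2.5. Let $\alpha_1, \dots, \alpha_m$ be the roots of a polynomial $f$, and let $\beta_1, \dots, \beta_n$ be the roots of a polynomial $g$. Then

$$E(f, g) \ge \min\Bigl\{\min_k \frac{|g(\alpha_k)|}{2^n},\ \min_l \frac{|f(\beta_l)|}{2^m}\Bigr\}. \tag{1}$$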


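Proof. Fix an arbitrary number $x \in \mathbb{C}$, and let $\alpha = \min\limits_j |x - \alpha_j|$ and $\beta = \min\limits_j |x - \beta_j|$. Then $\alpha = |x - \alpha_k|$ and $\beta = |x - \beta_l|$ for some $k$ and $l$. Since $f$ and $g$ are relatively prime, it follows that one of the numbers $\alpha$ and $\beta$ is positive.

Suppose that $\alpha \le \beta$. First, consider the case where $\alpha > 0$. We show that in this case

$$|x - \beta_i| \ge \frac{|\alpha_k - \beta_i|}{2} \tag{2}$$

for any $i$. Indeed, if $|\alpha_k - \beta_i| \le 2\alpha$, then

$$|x - \beta_i| \ge \beta \ge \alpha \ge \frac{|\alpha_k - \beta_i|}{2}.$$

If, however, $|\alpha_k - \beta_i| > 2\alpha = 2|x - \alpha_k|$, then again

$$|x - \beta_i| = |(x - \alpha_k) + (\alpha_k - \beta_i)| \ge |\alpha_k - \beta_i| - |x - \alpha_k| > \frac{|\alpha_k - \beta_i|}{2}.$$

Let $g(x) = b_0(x - \beta_1)\cdot\ldots\cdot(x - \beta_n)$. Inequality (2) implies that

$$|g(x)| = |b_0(x - \beta_1)\cdot\ldots\cdot(x - \beta_n)| \ge |b_0|\prod_{i=1}^n \frac{|\alpha_k - \beta_i|}{2} = \frac{|g(\alpha_k)|}{2^n}.$$

Second, we deal with $\alpha = 0$, i.e., $x = \alpha_k$. Then the inequality $|g(x)| \ge \dfrac{|g(\alpha_k)|}{2^n}$ is obviously satisfied.

Thus, if $\alpha \le \beta$, then $|g(x)| \ge \dfrac{|g(\alpha_k)|}{2^n}$. We similarly prove that if $\alpha > \beta$, then $|f(x)| \ge \dfrac{|f(\beta_l)|}{2^m}$. Thus in either case

$$m(x) \ge \min\Bigl\{\frac{|g(\alpha_k)|}{2^n},\ \frac{|f(\beta_l)|}{2^m}\Bigr\}.$$

Remark. If $f(x) = (x - 1)^m$ and $g(x) = (x + 1)^n$, inequality (1) becomes an equality. Hence (1) is precise.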
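The bound (1) and the example from the Remark can be explored numerically. In the sketch below the polynomials are given by a leading coefficient and a list of roots, and $E(f, g)$ is only approximated from above by a minimum over a finite square grid in the complex plane; the grid search and all function names are assumptions of this illustration.

```python
def eval_from_roots(lead, roots, x):
    """Value of lead * prod (x - r) at the complex point x."""
    value = complex(lead)
    for r in roots:
        value *= x - r
    return value

def lower_bound(lead_f, roots_f, lead_g, roots_g):
    """Right-hand side of (1): min over roots of |g(alpha_k)|/2^n and |f(beta_l)|/2^m."""
    m, n = len(roots_f), len(roots_g)
    from_f = min(abs(eval_from_roots(lead_g, roots_g, a)) for a in roots_f) / 2 ** n
    from_g = min(abs(eval_from_roots(lead_f, roots_f, b)) for b in roots_g) / 2 ** m
    return min(from_f, from_g)

def grid_min(lead_f, roots_f, lead_g, roots_g, radius=4.0, steps=161):
    """Minimum of max(|f|, |g|) over a steps x steps grid on [-radius, radius]^2."""
    best = float("inf")
    for i in range(steps):
        for j in range(steps):
            x = complex(-radius + 2 * radius * i / (steps - 1),
                        -radius + 2 * radius * j / (steps - 1))
            value = max(abs(eval_from_roots(lead_f, roots_f, x)),
                        abs(eval_from_roots(lead_g, roots_g, x)))
            best = min(best, value)
    return best
```

For $f = (x - 1)^2$ and $g = (x + 1)^3$ the bound equals 1, and the grid (which contains the point $x = 0$, where $m(0) = 1$) returns the same value up to rounding.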


4.2.6 Mignotte’s inequality 


In this chapter we have already considered several inequalities for estimating 
the coefficients of the factors of a given polynomial. An estimate of this type 
follows from the next theorem due to Mignotte [Mil]. 


Theorem 4.2.6. Let $f(x) = a_0 + a_1x + \dots + a_mx^m$ and $g(x) = b_0 + b_1x + \dots + b_nx^n$ be polynomials with integer coefficients. If $f$ is divisible by $g$, then

$$|b_j| \le \binom{n-1}{j}\|f\| + \binom{n-1}{j-1}|a_m|,$$

where

$$\|f\| = \sqrt{a_0^2 + \dots + a_m^2}.$$

Proof. Together with $f(x) = a_m\prod(x - \alpha_i)$, we consider a polynomial

$$\tilde f(x) = a_m\prod_{|\alpha_i|\ge1}(x - \alpha_i)\prod_{|\alpha_i|<1}(\bar\alpha_ix - 1).$$

We prove first that $\|f\| = \|\tilde f\|$, where the norm $\|\tilde f\|$ of $\tilde f$ is defined as for $f$. This follows immediately from the next lemma.


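Lemma 4.2.7. Let $h(x) = c_0 + c_1 x + \dots + c_k x^k$, let $h_1(x) = (x - \alpha) h(x)$ and $h_2(x) = (\bar\alpha x - 1) h(x)$. Then $\|h_1\| = \|h_2\|$.

Proof. Clearly,

$$\|h_1\|^2 = \sum |c_{i-1} - \alpha c_i|^2 = \sum \left( |c_{i-1}|^2 + |\alpha c_i|^2 - 2 \operatorname{Re}(\alpha c_i \bar c_{i-1}) \right) =$$

$$= \sum \left( |\bar\alpha c_{i-1}|^2 + |c_i|^2 - 2 \operatorname{Re}(\alpha c_i \bar c_{i-1}) \right) = \sum |\bar\alpha c_{i-1} - c_i|^2 = \|h_2\|^2.$$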
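
As a quick numerical sanity check of Lemma 4.2.7, the following Python sketch compares the two norms for one arbitrarily chosen polynomial $h$ and number $\alpha$. The helper names `poly_mul` and `norm` and the sample values are illustrative assumptions, not part of the text, and a floating-point check of this kind is of course no substitute for the proof.

```python
import math

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def norm(p):
    """Euclidean norm of the coefficient vector of p."""
    return math.sqrt(sum(abs(c) ** 2 for c in p))

# Arbitrary sample data (assumed for illustration only).
h = [1 + 2j, -3, 0.5j, 4]      # c_0, ..., c_3
alpha = 0.3 - 0.7j

h1 = poly_mul([-alpha, 1], h)                  # (x - alpha) h(x)
h2 = poly_mul([-1, alpha.conjugate()], h)      # (conj(alpha) x - 1) h(x)
gap = abs(norm(h1) - norm(h2))                 # should vanish up to rounding
```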
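The coefficient of the highest term $x^m$ of $\tilde f$ is $a_m \prod_{|\alpha_i| < 1} \bar\alpha_i$, and the coefficient of the lowest term is $\pm a_m \prod_{|\alpha_i| \ge 1} \alpha_i$ (we assume that the empty product, without any factors, is equal to 1). Set

$$M(f) = \prod_{|\alpha_i| \ge 1} |\alpha_i|, \qquad m(f) = \prod_{|\alpha_i| < 1} |\alpha_i|.$$

Then

$$\|f\|^2 = \|\tilde f\|^2 \ge |a_m|^2 \left( M(f)^2 + m(f)^2 \right).$$

In particular,

$$M(f) \le \frac{\|f\|}{|a_m|}. \tag{1}$$

Also

$$|a_j| = |a_m| \left| \sum \alpha_{i_1} \cdots \alpha_{i_{m-j}} \right| \le |a_m| \sum \beta_{i_1} \cdots \beta_{i_{m-j}}, \tag{2}$$

where $\beta_i = \max\{1, |\alpha_i|\}$. Clearly, $\prod \beta_i = M(f)$.
Now we need one more lemma.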
Now we need one more lemma. 


Lemma 4.2.8. Let $x_1 \ge 1, \dots, x_m \ge 1$ and $x_1 \cdots x_m = M$. Then

$$\sum_{1 \le i_1 < \dots < i_k \le m} x_{i_1} \cdots x_{i_k} \le \binom{m-1}{k-1} M + \binom{m-1}{k}.$$

Proof. We may assume that $x_1 \le x_2 \le \dots \le x_m$. Let us replace the pair $\{x_{m-1}, x_m\}$ by $\{1, x_{m-1} x_m\}$. As a result, the sum considered will increase by $\sigma (x_{m-1} - 1)(x_m - 1)$, where $\sigma$ is the sum of the products $x_{i_1} \cdots x_{i_{k-1}}$ over $1 \le i_1 < \dots < i_{k-1} \le m - 2$. Therefore, if $x_{m-1} > 1$, the sum considered strictly increases. Hence it will be maximal when $x_1 = \dots = x_{m-1} = 1$, $x_m = M$. In this case, the sum consists of $\binom{m-1}{k-1}$ terms equal to $M$ and $\binom{m-1}{k}$ terms equal to 1.


Applying Lemma 4.2.8 to the collection $\beta_1, \dots, \beta_m$, we see that

$$\sum \beta_{i_1} \cdots \beta_{i_{m-j}} \le \binom{m-1}{m-j-1} M(f) + \binom{m-1}{m-j}.$$

Now, if we take into account that $\binom{m-1}{m-j-1} = \binom{m-1}{j}$ and $\binom{m-1}{m-j} = \binom{m-1}{j-1}$, we can rewrite (2) in the form

$$|a_j| \le |a_m| \left( \binom{m-1}{j} M(f) + \binom{m-1}{j-1} \right).$$

We similarly prove that

$$|b_j| \le |b_n| \left( \binom{n-1}{j} M(g) + \binom{n-1}{j-1} \right). \tag{3}$$

All the roots of $g$ are roots of $f$, and hence $M(g) \le M(f)$. Further, $|b_n| \le |a_m|$ since by hypothesis $g$ divides $f$. Using these inequalities and (1), we can reduce (3) to the form required:

$$|b_j| \le \binom{n-1}{j} \|f\| + \binom{n-1}{j-1} |a_m|.$$


Corollary. If $f$, $g$ and $f/g$ are polynomials with integer coefficients, then

$$\|g\| \le \binom{2n}{n}^{1/2} \|f\|, \quad \text{where } n = \deg g.$$

Proof. Clearly, $|a_m| \le \|f\|$, and hence

$$|b_j| \le \left( \binom{n-1}{j} + \binom{n-1}{j-1} \right) \|f\| = \binom{n}{j} \|f\|.$$

Therefore $\|g\|^2 \le \sum_{j=0}^{n} \binom{n}{j}^2 \|f\|^2$. It remains to verify the combinatorial identity

$$\sum_{j=0}^{n} \binom{n}{j}^2 = \binom{2n}{n}.$$

To do this, it suffices to compare the coefficients of $t^n$ in both sides of the identity

$$(1 + t)^n (1 + t)^n = (1 + t)^{2n}.$$
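
The bound of Theorem 4.2.6 and its corollary are easy to evaluate numerically. The sketch below does so for one sample pair, $f = x^6 - 1$ and its divisor $g = x^2 + x + 1$; the function names `binom` and `mignotte_bounds` and the sample polynomials are assumptions made for illustration.

```python
from math import comb, sqrt

def binom(n, k):
    """Binomial coefficient, taken to be 0 outside the range 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

def mignotte_bounds(f, n):
    """Upper bounds for |b_0|, ..., |b_n| of a degree-n divisor of f.

    f is a list of integer coefficients a_0, ..., a_m (lowest degree first).
    """
    norm_f = sqrt(sum(a * a for a in f))
    lead = abs(f[-1])
    return [binom(n - 1, j) * norm_f + binom(n - 1, j - 1) * lead
            for j in range(n + 1)]

# Sample data: x^6 - 1 = (x^2 + x + 1)(x^4 - x^3 + x - 1).
f = [-1, 0, 0, 0, 0, 0, 1]
g = [1, 1, 1]
n = len(g) - 1
bounds = mignotte_bounds(f, n)
holds = all(abs(b) <= bound + 1e-12 for b, bound in zip(g, bounds))

# Corollary: ||g|| <= sqrt(C(2n, n)) * ||f||.
norm_f = sqrt(sum(a * a for a in f))
norm_g = sqrt(sum(b * b for b in g))
corollary_holds = norm_g <= sqrt(comb(2 * n, n)) * norm_f
```

In factoring algorithms, bounds of this kind are typically used to decide how large a modulus must be before integer factors can be recovered from modular ones.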


4.3 Equations for polynomials 


4.3.1 Diophantine equations for polynomials 
Mason’s theorem and its corollaries 


In the proof of the insolvability of various Diophantine equations for polyno- 
mials the following statement is rather effective. 


Theorem 4.3.1 (Mason). Let $a(x)$, $b(x)$ and $c(x)$ be pairwise relatively prime polynomials such that $a + b + c = 0$. Then the degree of each of these polynomials does not exceed $n_0(abc) - 1$, where $n_0(P)$ is the number of distinct roots of the polynomial $P$.


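Proof. [La4] Set $f = \dfrac{a}{c}$ and $g = \dfrac{b}{c}$. Then $f$ and $g$ are rational functions which satisfy $f + g + 1 = 0$. Differentiating this equality we see that $f' = -g'$. Hence,

$$\frac{b}{a} = \frac{g}{f} = -\frac{f'/f}{g'/g}.$$

The rational functions $f$ and $g$ are of a particular form:

$$\prod (x - \rho_i)^{r_i}, \quad \text{where } r_i \in \mathbb{Z}.$$

For the function $R(x) = \prod (x - \rho_i)^{r_i}$, we have

$$\frac{R'}{R} = \sum \frac{r_i}{x - \rho_i}.$$

Let

$$a(x) = \prod (x - \alpha_i)^{a_i}, \quad b(x) = \prod (x - \beta_j)^{b_j}, \quad c(x) = \prod (x - \gamma_k)^{c_k}.$$

Then

$$\frac{b}{a} = -\frac{f'/f}{g'/g} = -\frac{\sum \dfrac{a_i}{x - \alpha_i} - \sum \dfrac{c_k}{x - \gamma_k}}{\sum \dfrac{b_j}{x - \beta_j} - \sum \dfrac{c_k}{x - \gamma_k}}.$$

Therefore, after multiplication by the polynomial

$$N_0 = \prod (x - \alpha_i)(x - \beta_j)(x - \gamma_k)$$

of degree $n_0(abc)$, the rational functions $f'/f$ and $g'/g$ become polynomials of degree not greater than $n_0(abc) - 1$. Then, since $a(x)$ and $b(x)$ are relatively prime and

$$\frac{b}{a} = -\frac{N_0 f'/f}{N_0 g'/g},$$

the degree of each of the polynomials $a(x)$ and $b(x)$ does not exceed $n_0(abc) - 1$. For $c(x)$, the proof is similar.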
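
Mason's bound can be checked mechanically on examples. The sketch below computes $n_0(P)$ as $\deg P - \deg \gcd(P, P')$ with exact rational arithmetic and tests the triple $a = (x+1)^2$, $b = -x^2$, $c = -(2x+1)$, for which the bound is attained. All helper names and the sample triple are assumptions chosen for illustration.

```python
from fractions import Fraction

def trim(p):
    """Drop zero leading coefficients (lists are lowest degree first)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def rem(p, q):
    """Remainder of p on division by a nonzero polynomial q."""
    p = trim([Fraction(c) for c in p])
    while len(p) >= len(q) and any(p):
        factor = p[-1] / q[-1]
        shift = len(p) - len(q)
        for i, c in enumerate(q):
            p[shift + i] -= factor * c
        p = trim(p)            # the leading coefficient is now exactly zero
    return p

def gcd(p, q):
    """Euclidean algorithm; the result is determined up to a constant factor."""
    p, q = trim(p), trim(q)
    while any(q):
        p, q = q, rem(p, q)
    return p

def derivative(p):
    return trim([i * c for i, c in enumerate(p)][1:] or [0])

def distinct_roots(p):
    """n_0(P) = deg P - deg gcd(P, P'), valid for a nonzero polynomial P."""
    p = trim(p)
    return (len(p) - 1) - (len(gcd(p, derivative(p))) - 1)

# Sample triple with a + b + c = 0: (x + 1)^2 - x^2 - (2x + 1) = 0.
a, b, c = [1, 2, 1], [0, 0, -1], [-1, -2]
max_degree = max(len(a), len(b), len(c)) - 1
bound = distinct_roots(mul(mul(a, b), c)) - 1
```

Here `max_degree` and `bound` are both 2, so for this triple the inequality of Theorem 4.3.1 is an equality.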


Theorem 4.3.1 has several interesting corollaries which we formulate as the 
following Theorems 4.3.2 — 4.3.4. 


Theorem 4.3.2 (Davenport). Let $f$ and $g$ be relatively prime polynomials of nonzero degree. Then

$$\deg(f^3 - g^2) \ge \frac{1}{2} \deg f + 1.$$

Proof. If $\deg f^3 \ne \deg g^2$, then

$$\deg(f^3 - g^2) \ge \deg f^3 = 3 \deg f \ge \frac{1}{2} \deg f + 1.$$

Thus, we may assume that $\deg f^3 = \deg g^2 = 6k$.

Now consider the polynomials $F = f^3$, $G = g^2$ and $H = F - G = f^3 - g^2$. Clearly, $\deg H \le 6k$. By Theorem 4.3.1

$$\max\{\deg F, \deg G, \deg H\} \le n_0(FGH) - 1 \le \deg f + \deg g + \deg H - 1,$$

i.e.,

$$6k \le 2k + 3k + \deg H - 1.$$

Hence $\deg H \ge k + 1 = \frac{1}{2} \deg f + 1$.

Remark. For the polynomials

$$f(t) = t^2 + 2, \qquad g(t) = t^3 + 3t,$$

Davenport's inequality becomes an equality.
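
The equality case in the remark can be confirmed by direct expansion. The following sketch, with hypothetical helpers `mul` and `sub` working on coefficient lists, computes $f^3 - g^2$ for $f(t) = t^2 + 2$, $g(t) = t^3 + 3t$ and compares its degree with $\frac12 \deg f + 1$.

```python
def mul(p, q):
    """Product of polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sub(p, q):
    """Difference p - q with zero leading coefficients removed."""
    size = max(len(p), len(q))
    p = p + [0] * (size - len(p))
    q = q + [0] * (size - len(q))
    out = [a - b for a, b in zip(p, q)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

f = [2, 0, 1]        # t^2 + 2
g = [0, 3, 0, 1]     # t^3 + 3t
difference = sub(mul(mul(f, f), f), mul(g, g))   # f^3 - g^2
lhs = len(difference) - 1                        # deg(f^3 - g^2)
rhs = (len(f) - 1) // 2 + 1                      # (1/2) deg f + 1 (deg f is even here)
```

The difference is $3t^2 + 8$, so both sides equal 2.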


Theorem 4.3.3. Let $f$, $g$ and $h$ be relatively prime polynomials, at least one of them not being a constant. Then the identity

$$f^n + g^n = h^n$$

cannot hold for $n \ge 3$.

Proof. On the assumption that the identity holds, the degree of each of the polynomials $f^n$, $g^n$ and $h^n$ does not exceed

$$\deg f + \deg g + \deg h - 1,$$

by Theorem 4.3.1. Adding up these three inequalities we obtain

$$n(\deg f + \deg g + \deg h) \le 3(\deg f + \deg g + \deg h - 1).$$

Hence $n < 3$.


The Diophantine equation $f^\alpha + g^\beta = h^\gamma$ for polynomials $f, g, h$ has an obvious solution if one of the numbers $\alpha, \beta, \gamma$ is equal to 1. Therefore in what follows we assume that $\alpha, \beta, \gamma \ge 2$.

Theorem 4.3.4. Let $\alpha, \beta, \gamma$ be positive integers and $2 \le \alpha \le \beta \le \gamma$. Then the equation

$$f^\alpha + g^\beta = h^\gamma$$

has relatively prime solutions only for the following collections $(\alpha, \beta, \gamma)$:

$$(2, 2, \gamma), \quad (2, 3, 3), \quad (2, 3, 4), \quad (2, 3, 5).$$

Proof. Let $a$, $b$ and $c$ be the degrees of $f$, $g$ and $h$ respectively. Then by Theorem 4.3.1

$$\alpha a \le a + b + c - 1, \tag{1}$$
$$\beta b \le a + b + c - 1, \tag{2}$$
$$\gamma c \le a + b + c - 1. \tag{3}$$

Hence

$$\alpha(a + b + c) \le \alpha a + \beta b + \gamma c \le 3(a + b + c) - 3,$$

giving $\alpha < 3$. By hypothesis $\alpha \ge 2$, and hence $\alpha = 2$. For $\alpha = 2$, inequality (1) becomes

$$a \le b + c - 1. \tag{4}$$

Adding together inequalities (4), (2) and (3) we obtain

$$\beta b + \gamma c \le 3(b + c) + a - 3.$$

Since $\beta \le \gamma$ and applying (4) once again, we obtain

$$\beta(b + c) \le 4(b + c) - 4,$$

giving $\beta < 4$. Hence $\beta = 2$ or 3.

It remains to prove that if $\beta = 3$, then $\gamma \le 5$. For $\beta = 3$, inequality (2) becomes

$$2b \le a + c - 1. \tag{5}$$

Adding together (4) and (5) we obtain

$$b \le 2c - 2.$$

Then (4) implies that

$$a \le 3c - 3.$$

The last two inequalities and (3) imply that

$$\gamma c \le 6c - 6,$$

and so $\gamma \le 5$.

The polynomials satisfying the relation $f^\alpha + g^\beta = h^\gamma$ are closely related to regular polyhedra. Felix Klein described this relation in detail in the book [KF], where a method of constructing these polynomials is also given. Let us recall the final result.

The case $\alpha = \beta = 2$, $\gamma = n$ is related to a degenerate regular polyhedron, namely the planar $n$-gon. The relation in question is

$$\left( \frac{x^n + 1}{2} \right)^2 - \left( \frac{x^n - 1}{2} \right)^2 = x^n.$$

The case $\alpha = 2$, $\beta = 3$, $\gamma = 3$ is related to a regular tetrahedron. The relation is

$$12 i \sqrt{3}\, (x^5 - x)^2 + (x^4 - 2 i \sqrt{3}\, x^2 + 1)^3 = (x^4 + 2 i \sqrt{3}\, x^2 + 1)^3.$$

The case $\alpha = 2$, $\beta = 3$, $\gamma = 4$ is related to a cube and a regular octahedron. The relation is

$$(x^{12} - 33 x^8 - 33 x^4 + 1)^2 + 108 (x^5 - x)^4 = (x^8 + 14 x^4 + 1)^3.$$

The case $\alpha = 2$, $\beta = 3$, $\gamma = 5$ is related to a dodecahedron and an icosahedron. The relation is

$$T^2 + H^3 = 1728 f^5,$$

where

$$T = x^{30} + 1 + 522 (x^{25} - x^5) - 10005 (x^{20} + x^{10}),$$
$$H = -(x^{20} + 1) + 228 (x^{15} - x^5) - 494 x^{10},$$
$$f = x (x^{10} + 11 x^5 - 1).$$
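
The octahedral and icosahedral relations have integer coefficients, so they can be confirmed by exact expansion. The sketch below stores a polynomial as a dictionary mapping exponents to coefficients; this representation and the helper names are assumptions made for illustration, and the tetrahedral relation (which involves $i\sqrt{3}$) is not covered.

```python
def mul(p, q):
    """Product of sparse polynomials stored as {exponent: coefficient}."""
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return {k: v for k, v in out.items() if v != 0}

def add(p, q):
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

def power(p, n):
    out = {0: 1}
    for _ in range(n):
        out = mul(out, p)
    return out

def scale(p, c):
    return {k: c * v for k, v in p.items()}

# Octahedral relation.
octa_lhs = add(power({12: 1, 8: -33, 4: -33, 0: 1}, 2),
               scale(power({5: 1, 1: -1}, 4), 108))
octa_rhs = power({8: 1, 4: 14, 0: 1}, 3)

# Icosahedral relation T^2 + H^3 = 1728 f^5.
T = {30: 1, 0: 1, 25: 522, 5: -522, 20: -10005, 10: -10005}
H = {20: -1, 0: -1, 15: 228, 5: -228, 10: -494}
f = {11: 1, 6: 11, 1: -1}
icosa_lhs = add(power(T, 2), power(H, 3))
icosa_rhs = scale(power(f, 5), 1728)
```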


Theorem 4.3.4 was proved by H. Schwarz [Sc7]. The solution of Diophan- 
tine equations of a more general type 


‘ieee g? = [HRY 
is given in [Ev]. 


Theorem 4.3.5 ([Na]). Let $x(t)$ and $y(t)$ be rational functions; let $m \ge 2$ and $n \ge 2$. Then the equation

$$x^m - y^n = 1$$

has solutions only for $m = n = 2$.

Proof. Let us express $x$ and $y$ in the form $x = \dfrac{f}{g}$ and $y = \dfrac{h}{k}$, where $f$ and $g$ are relatively prime polynomials and $h$ and $k$ are also relatively prime polynomials. Then the equation considered takes the form

$$f^m k^n - h^n g^m = g^m k^n. \tag{6}$$

Since $f$ and $g$ are relatively prime, it follows that if $g(\alpha) = 0$, then $f(\alpha) \ne 0$. In this case (6) implies that $k(\alpha) = 0$. Similarly, if $k(\alpha) = 0$, then $g(\alpha) = 0$. Hence $g(t) = \prod (t - \alpha_i)^{a_i}$ and $k(t) = \prod (t - \alpha_i)^{b_i}$, where $a_i, b_i \ge 1$.

The multiplicity of $\alpha_i$ as a root of the polynomials $f^m k^n$, $h^n g^m$ and $g^m k^n$ is equal to $n b_i$, $m a_i$ and $n b_i + m a_i$, respectively. If $n b_i \ne m a_i$, then the multiplicity of the root $\alpha_i$ of the polynomial $f^m k^n - h^n g^m$ is strictly less than $n b_i + m a_i$. Hence $n b_i = m a_i$, i.e., $k^n = g^m$.

After division by $k^n = g^m$, equation (6) becomes

$$f^m - h^n = g^m.$$

By Theorem 4.3.4 only two possibilities can occur: $\{m, n\} = \{2, 2\}$ or $\{2, 3\}$. But in the second case $k^n = g^m = l^6$, where $l$ is a polynomial, and the equation $f^2 - h^3 = l^6$ has no solutions.


Waring’s problem for polynomials 


The classical Waring’s problem is as follows: 


given a positive integer $n$, find the minimal number $k = k(n)$ for which any positive integer $m$ can be represented in the form $m = m_1^n + \dots + m_k^n$, where $m_1, \dots, m_k$ are non-negative integers.

Several generalizations of this problem for polynomials are known. Here by Waring's problem for polynomials, we will mean the following problem:

given a positive integer $n$, find the minimal number $k = k(n)$ for which any polynomial $g \in \mathbb{C}[x]$ can be represented in the form $g = f_1^n + \dots + f_k^n$, where $f_i \in \mathbb{C}[x]$.

To solve Waring's problem, it suffices to confine ourselves to the case $g(x) = x$. Indeed, if $x = f_1^n(x) + \dots + f_k^n(x)$ and $h(x)$ is an arbitrary polynomial, then $h(x) = f_1^n(h(x)) + \dots + f_k^n(h(x))$.

The identity $\left( x + \frac14 \right)^2 - \left( x - \frac14 \right)^2 = x$ shows that $k(2) = 2$.

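Theorem 4.3.6 ([Ne]). If $n \ge 3$, then

a) $k(n) \ge 3$;

b) $k(n) \le n < k^2(n) - k(n)$.

Proof. a) Suppose on the contrary that

$$x = f_1^n(x) + f_2^n(x) = \prod_{r=1}^{n} (f_1 + \varepsilon^r f_2),$$

where $\varepsilon$ is a primitive $n$-th root of unity (if $n$ is even, one first replaces $f_2$ by $\omega f_2$ with $\omega^n = -1$, so that the sum becomes $f_1^n - f_2^n$ and the same factorization applies). All the factors $f_1 + \varepsilon^r f_2$, except one, are constants. For $n \ge 3$, there are at least two such factors. Therefore $f_1 + a f_2 = \alpha$ and $f_1 + b f_2 = \beta$, where $a \ne b$ and $a, b, \alpha, \beta \in \mathbb{C}$. Hence $f_1, f_2 \in \mathbb{C}$, which is impossible.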

b) For any $f(x)$, set

$$\Delta f(x) = f(x + 1) - f(x) \quad \text{and} \quad \Delta^p f = \Delta(\Delta^{p-1} f) \text{ for } p \ge 2.$$

It is easy to verify that $\deg(\Delta f) = \deg f - 1$, and so $\deg \Delta^{n-1}(x^n) = 1$, i.e., $\Delta^{n-1}(x^n) = ax + b$. On the other hand, the definition directly implies that

$$\Delta^{n-1}(x^n) = (x + n - 1)^n + c_1 (x + n - 2)^n + \dots + c_{n-1} x^n.$$

Indeed, for example,

$$\Delta^2(x^3) = \left( (x + 2)^3 - (x + 1)^3 \right) - \left( (x + 1)^3 - x^3 \right).$$

Making a change of variables $x_1 = ax + b$ we get a representation

$$x_1 = f_1^n(x_1) + \dots + f_n^n(x_1),$$

which shows that $k(n) \le n$.
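
The step from $\Delta^{n-1}(x^n)$ to a linear polynomial can be made concrete. The sketch below expands $\Delta^{n-1}(x^n) = \sum_{j=0}^{n-1} (-1)^{n-1-j} \binom{n-1}{j} (x + j)^n$ and returns its coefficients; the function names are illustrative assumptions, and the alternating binomial weights are the standard closed form of the constants $c_i$ that the text leaves unspecified.

```python
from math import comb

def shifted_power(n, j):
    """Coefficients (lowest degree first) of (x + j)^n."""
    return [comb(n, i) * j ** (n - i) for i in range(n + 1)]

def iterated_difference(n):
    """Coefficients of the (n-1)-fold difference of x^n, lowest degree first."""
    total = [0] * (n + 1)
    for j in range(n):
        weight = (-1) ** (n - 1 - j) * comb(n - 1, j)
        for i, c in enumerate(shifted_power(n, j)):
            total[i] += weight * c
    return total

cubic_case = iterated_difference(3)     # coefficients of 6x + 6
quintic_case = iterated_difference(5)   # coefficients of 120x + 240
```

For $n = 3$ this gives $6x + 6 = (x+2)^3 - 2(x+1)^3 + x^3$, so after the substitution $x_1 = 6x + 6$ the variable $x_1$ is a combination of three cubes of linear polynomials in $x_1$, and the complex coefficients $1, -2, 1$ can be absorbed into the cubes.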

Next, we prove the inequality $n < k^2(n) - k(n)$. Consider the representation $x = f_1^n(x) + \dots + f_k^n(x)$, where $f_i \in \mathbb{C}[x]$, with the minimal $k$.

Recall that the Wronski determinant, or Wronskian, $W(g_1, \dots, g_k)$ of the functions $g_1(x), \dots, g_k(x)$ is the determinant of the matrix

$$\begin{pmatrix} g_1(x) & \dots & g_k(x) \\ g_1'(x) & \dots & g_k'(x) \\ \vdots & & \vdots \\ g_1^{(k-1)}(x) & \dots & g_k^{(k-1)}(x) \end{pmatrix}.$$

Consider two Wronski determinants,

$$W_1 = W(f_1^n, f_2^n, \dots, f_k^n) \quad \text{and} \quad W_2 = W(x, f_2^n, \dots, f_k^n).$$

By hypothesis, $x = f_1^n + \dots + f_k^n$, and so the first column of $W_2$ is obtained from the first column of $W_1$ by adding a linear combination of the other columns of $W_1$. Hence $W_1 = W_2$.

If the functions $g_1, g_2, \dots, g_k$ are linearly dependent, then $W(g_1, \dots, g_k)$ vanishes identically. The converse statement is false. For example, if $g_1(x) = x^2$ and $g_2(x) = x|x|$, then

$$W(g_1, g_2) = \begin{vmatrix} x^2 & x|x| \\ 2x & 2|x| \end{vmatrix} = 0,$$

but the functions $g_1$ and $g_2$ are linearly independent in any interval $(-a, a)$.

It is known that if $W(g_1, \dots, g_k)$ vanishes identically for $x \in (a, b)$, then there exists a subinterval $(\alpha, \beta) \subset (a, b)$ on which the functions $g_1, \dots, g_k$ are linearly dependent (for a simple proof of this statement, see [Kr2]). In particular, for polynomials $g_1, \dots, g_k$, the fact that $W(g_1, \dots, g_k)$ identically vanishes implies that these polynomials are linearly dependent.

Since the representation $x = f_1^n(x) + \dots + f_k^n(x)$ is minimal, it follows that the functions $f_1^n, \dots, f_k^n$ are linearly independent, and so $W(f_1^n, f_2^n, \dots, f_k^n)$ is a nonzero polynomial.

The $r$-th derivative of $f_i^n$ is divisible by $f_i^{n-r}$, and so the $i$-th column of the Wronskian is divisible by $f_i^{n-k+1}$. Therefore $W(f_1^n, \dots, f_k^n)$ is divisible by $\prod_{i=1}^{k} f_i^{n-k+1}$. In particular,

$$\deg W(f_1^n, \dots, f_k^n) \ge (n - k + 1) \sum_{i=1}^{k} \deg f_i. \tag{1}$$

On the other hand, we now prove that

$$\deg W(f_1^n, \dots, f_k^n) \le n \sum_{i=2}^{k} \deg f_i - \frac{k(k-1)}{2} + 1. \tag{2}$$

First, we recall that $W(f_1^n, \dots, f_k^n) = W(x, f_2^n, \dots, f_k^n)$. If we multiply the $j$-th row of the determinant¹ $W(x, f_2^n, \dots, f_k^n)$ by $x^{j-1}$ we obtain a determinant for which all the nonzero elements in the $i$-th column (for $i \ge 2$) are of degree $n \deg f_i$. Hence

$$\deg W(f_1^n, \dots, f_k^n) \le 1 + n \sum_{i=2}^{k} \deg f_i - 1 - 2 - \dots - (k - 1) = n \sum_{i=2}^{k} \deg f_i - \frac{k(k-1)}{2} + 1.$$

Comparing (1) with (2) we see that

$$(n - k + 1) \sum_{i=1}^{k} \deg f_i \le n \sum_{i=2}^{k} \deg f_i - \frac{k(k-1)}{2} + 1,$$

i.e.,

$$n \deg f_1 \le (k - 1) \sum_{i=1}^{k} \deg f_i - \frac{k(k-1)}{2} + 1.$$

We may assume that $f_1$ is the polynomial of the highest degree. Then

$$n \deg f_1 \le k(k - 1) \deg f_1 - \frac{k(k-1)}{2} + 1 < k(k - 1) \deg f_1.$$

The last inequality follows from the fact that $1 - \frac12 k(k - 1) < 0$ for $k \ge 3$. After division by $\deg f_1$ we obtain $n < k(k - 1)$.


¹ By abuse of language, we mean here the matrix whose determinant we consider (a determinant itself is a number and has no rows or columns).

4.3.2 Functional equations for polynomials
Functional equations that determine polynomials

Every polynomial $f$ of degree $n + 1$ satisfies the identity

$$f(x) = f(y) + (x - y) f'(y) + \dots + (x - y)^{n+1} \frac{f^{(n+1)}(y)}{(n+1)!}.$$

Here $f^{(n+1)}$ is a constant, hence,

$$(x - y)^{n+1} \frac{f^{(n+1)}}{(n+1)!} = (x - y)^n \left( -\frac{y f^{(n+1)}}{(n+1)!} - c \right) + (x - y)^n \left( \frac{x f^{(n+1)}}{(n+1)!} + c \right).$$

Therefore the functional equation

$$f(x) = \sum_{k=0}^{n} (x - y)^k g_k(y) + (x - y)^n h(x) \tag{1}$$

has a solution of the following form:

a) $f$ is a polynomial of degree not higher than $n + 1$;

b) $g_k(y) = \dfrac{f^{(k)}(y)}{k!}$ for $k = 0, 1, \dots, n - 1$;

c) $g_n(y) = \dfrac{f^{(n)}(y)}{n!} - \dfrac{y f^{(n+1)}}{(n+1)!} - c$;

d) $h(x) = \dfrac{x f^{(n+1)}}{(n+1)!} + c$.


Theorem 4.3.7 ([Cr]). Let $f, g_0, \dots, g_n, h : \mathbb{R} \to \mathbb{R}$ be arbitrary functions which satisfy (1) for any $x, y \in \mathbb{R}$ such that $x \ne y$. Then these functions are of the form described in a) to d) above.


Proof. For $y = 0$ and $y = 1$, equation (1) takes the form

$$f(x) = \sum_{k=0}^{n} c_k x^k + x^n h(x) \quad \text{for } x \ne 0, \tag{2}$$

$$f(x) = \sum_{k=0}^{n} d_k x^k + (x - 1)^n h(x) \quad \text{for } x \ne 1. \tag{3}$$

Hence

$$h(x) = \frac{\sum_{k=0}^{n} (d_k - c_k) x^k}{x^n - (x - 1)^n}.$$

This equality holds for $x \ne 0, 1$ and, if $n$ is even, we should also exclude $x = \frac12$.

Fix $y = 2$ and $y = 4$. We similarly obtain the equality

$$h(x) = \frac{\sum_{k=0}^{n} (f_k - e_k) x^k}{(x - 2)^n - (x - 4)^n},$$

which holds for $x \ne 2, 3, 4$. As a result, we see that $h \in C^\infty(\mathbb{R})$, but then (2) and (3) imply that $f \in C^\infty(\mathbb{R})$.

Differentiating (1) $n$ times with respect to $x$, we see that

$$f^{(n)}(x) = n!\, g_n(y) + \frac{d^n}{dx^n} \left( (x - y)^n h(x) \right).$$

For a fixed $x$, this equality implies that $g_n(y)$ is a polynomial. Now we may differentiate (1) with respect to $x$ not $n$ but $n - 1$ times and similarly deduce that $g_{n-1}(y)$ is a polynomial, and so on. In particular, $g_0, \dots, g_n \in C^\infty(\mathbb{R})$.

Next, we differentiate (1) with respect to $y$ and set $y = 0$. As a result we obtain

$$0 = -\sum k x^{k-1} g_k(0) + \sum x^k g_k'(0) - n x^{n-1} h(x).$$

This equality implies that $x^{n-1} h(x)$ is a polynomial of degree not greater than $n$. If we differentiate (1) $n$ times with respect to $y$ and set $y = 0$ we can show that $h(x)$ is a polynomial (of degree also not higher than $n$). But if $h(x)$ is a polynomial and $x^{n-1} h(x)$ is a polynomial of degree not higher than $n$, then $h(x) = ax + b$.

Since $h(x) = ax + b$ and $f \in C^\infty(\mathbb{R})$, it follows that (2) implies that $f(x)$ is a polynomial of degree no higher than $n + 1$. Therefore

$$f(x + y) = \sum_{k=0}^{n+1} \frac{f^{(k)}(y)}{k!} x^k.$$

On the other hand, replacing $x$ by $x + y$ we can rewrite (1) as

$$f(x + y) = \sum_{k=0}^{n} x^k g_k(y) + x^n (ax + ay + b).$$

This means that

$$g_k(y) = \frac{f^{(k)}(y)}{k!} \quad \text{for } k = 0, 1, \dots, n - 1,$$

$$g_n(y) = \frac{f^{(n)}(y)}{n!} - ay - b \quad \text{and} \quad a = \frac{f^{(n+1)}(y)}{(n+1)!}.$$

Corollary 1. Let $f \in C^n(\mathbb{R})$ and suppose that


2) — yo ew" pw) 
19- EERE IM  porae so 
(o-y)" (n +1)! 


for any $x, y \in \mathbb{R}$ such that $x \ne y$. Then $f$ is a polynomial of degree no higher than $n$.


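Corollary 2 ([Ha]). If

$$\frac{f(x) - g(y)}{x - y} = \frac{\varphi(x) + \varphi(y)}{2} \tag{4}$$

for any $x, y \in \mathbb{R}$ such that $x \ne y$, then $f$ is a polynomial of degree no higher than 2, and $g = f$ and $\varphi = f'$.

The functional equation

$$\frac{f(x) - g(y)}{x - y} = \varphi\left( \frac{x + y}{2} \right) \tag{5}$$

also reduces to the functional equation (4). Indeed, if (5) holds for any $x, y \in \mathbb{R}$ such that $x \ne y$, then

$$\varphi\left( \frac{x + y}{2} \right) = \frac{\varphi(x) + \varphi(y)}{2}.$$

To prove this, in (5) we replace $x$ by $x + y$ and $y$ by $x - y$. Then we obtain

$$\frac{f(x + y) - g(x - y)}{2y} = \varphi(x)$$

for any $x, y \in \mathbb{R}$ such that $y \ne 0$. Now replacing $y$ by $-y$ we obtain

$$\frac{f(x - y) - g(x + y)}{-2y} = \varphi(x).$$

Hence

$$f(u + v + y) - g(u + v - y) = 2y\, \varphi(u + v),$$
$$f(u - v + y) - g(u - v - y) = 2y\, \varphi(u - v)$$

and

$$f(u + v + y) - g(u - v - y) = 2(v + y)\, \varphi(u),$$
$$f(u - v + y) - g(u + v - y) = -2(v - y)\, \varphi(u).$$

Therefore

$$\varphi(u + v) + \varphi(u - v) = 2 \varphi(u).$$

Setting $u + v = x$ and $u - v = y$ we obtain the identity required:

$$\varphi(x) + \varphi(y) = 2 \varphi\left( \frac{x + y}{2} \right).$$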
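
For a quadratic polynomial both functional equations are satisfied with $g = f$ and $\varphi = f'$, which is easy to confirm with exact arithmetic. In the sketch below the sample polynomial $3x^2 - 5x + 7$ and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction

def f(x):
    """Sample quadratic polynomial (an arbitrary choice)."""
    return 3 * x * x - 5 * x + 7

def df(x):
    """Its derivative, playing the role of phi."""
    return 6 * x - 5

def check(x, y):
    """Return the divided difference and the right-hand sides of (4) and (5)."""
    quotient = (f(x) - f(y)) / (x - y)
    return quotient, (df(x) + df(y)) / 2, df((x + y) / 2)

samples = [(Fraction(1), Fraction(4)), (Fraction(-2, 3), Fraction(5, 7))]
results = [check(x, y) for x, y in samples]
```

Each triple in `results` consists of three equal rational numbers, since the divided difference of this quadratic is $3(x + y) - 5$.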
Polynomial solutions of the equation $f(\alpha x + \beta) = f(x)$

For $\alpha = \pm 1$, the polynomial solutions of the equation $f(\alpha x + \beta) = f(x)$ are easy to find. If $\alpha = 1$ and $f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_n$, where $a_0 \ne 0$, we obtain the identity

$$f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_n = a_0 (x + \beta)^n + a_1 (x + \beta)^{n-1} + \dots + a_n.$$

This identity is only possible when $a_1 = a_1 + a_0 n \beta$, i.e., $\beta = 0$.

If $\alpha = -1$, then the equation $f(-x + \beta) = f(x)$ reduces under the change $g(x) = f\left( x + \dfrac{\beta}{2} \right)$ to the equation $g(x) = g(-x)$ whose solutions are polynomials of the form

$$a_0 x^{2n} + a_2 x^{2n-2} + \dots + a_{2n}.$$

For an arbitrary $\alpha$, comparison of the coefficients of the highest term of $f(x) = a_0 x^n + \dots$ shows that $\alpha^n = 1$. Therefore, for $\alpha \ne \pm 1$, we see that $n \ge 3$, and we deal with this case in the next theorem.


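Theorem 4.3.8 ([Oz]). Let a polynomial $f$ of degree $n \ge 3$ satisfy the relation $f(\alpha x + \beta) = f(x)$, where $\alpha \ne \pm 1$ and $\alpha^n = 1$. Then

$$f(x) = a_0 \left( x + \frac{\beta}{\alpha - 1} \right)^n + c.$$

Proof. It suffices to consider polynomials of the form

$$f(x) = x^n + a_1 x^{n-1} + \dots + a_n.$$

We have to prove that $a_j = \dbinom{n}{j} \dfrac{\beta^j}{(\alpha - 1)^j}$ for $j = 1, \dots, n - 1$. We use induction on $j$.

Comparison of the coefficients of $x^{n-j}$ for $f(x)$ and $f(\alpha x + \beta)$ shows that

$$(1 - \alpha^{n-j}) a_j = \sum_{s=0}^{j-1} \binom{n-s}{j-s} \alpha^{n-j} \beta^{j-s} a_s \quad (\text{with } a_0 = 1). \tag{1}$$

For $j = 1$, we obtain $(1 - \alpha^{n-1}) a_1 = \binom{n}{1} \alpha^{n-1} \beta$. By the hypothesis $\alpha^n = 1$, and so

$$a_1 = \binom{n}{1} \frac{\alpha^{n-1} \beta}{1 - \alpha^{n-1}} = \binom{n}{1} \frac{\beta}{\alpha - 1}.$$

The start of the induction is proved.

To prove the inductive step, i.e., the passage from $j = k$ to $j = k + 1$, we again use (1) and the condition $\alpha^n = 1$. By the induction hypothesis, $a_s = \dbinom{n}{s} \dfrac{\beta^s}{(\alpha - 1)^s}$ for $s = 1, \dots, k$. Hence

$$(1 - \alpha^{n-k-1}) a_{k+1} = \binom{n}{k+1} \alpha^{n-k-1} \beta^{k+1} + \sum_{s=1}^{k} \binom{n}{s} \binom{n-s}{k+1-s} \frac{\alpha^{n-k-1} \beta^{k+1}}{(\alpha - 1)^s}.$$

It is easy to verify that $\dbinom{n}{s} \dbinom{n-s}{k+1-s} = \dbinom{n}{k+1} \dbinom{k+1}{s}$. Therefore

$$(1 - \alpha^{n-k-1}) a_{k+1} = \binom{n}{k+1} \alpha^{n-k-1} \beta^{k+1} \left[ 1 + \binom{k+1}{1} \frac{1}{\alpha - 1} + \dots + \binom{k+1}{k} \frac{1}{(\alpha - 1)^k} \right].$$

It is also clear that the expression in the square brackets is equal to

$$\left( 1 + \frac{1}{\alpha - 1} \right)^{k+1} - \left( \frac{1}{\alpha - 1} \right)^{k+1} = \frac{\alpha^{k+1} - 1}{(\alpha - 1)^{k+1}}.$$

Therefore

$$a_{k+1} = \frac{1}{1 - \alpha^{n-k-1}} \binom{n}{k+1} \beta^{k+1} \frac{\alpha^n - \alpha^{n-k-1}}{(\alpha - 1)^{k+1}}.$$

Since $\alpha^n = 1$, we have

$$a_{k+1} = \binom{n}{k+1} \frac{\beta^{k+1}}{(\alpha - 1)^{k+1}}.$$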
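
That polynomials of the form in Theorem 4.3.8 do satisfy $f(\alpha x + \beta) = f(x)$ can be seen numerically. The sketch below uses a primitive cube root of unity for $\alpha$ and arbitrary sample values for $\beta$, $a_0$ and $c$ (all assumptions made for illustration), and measures the difference of the two sides at a given point.

```python
import cmath

n = 3
alpha = cmath.exp(2j * cmath.pi / n)      # primitive cube root of unity
beta, a0, c = 2 - 1j, 1.5, 0.25j          # arbitrary sample values

def f(x):
    return a0 * (x + beta / (alpha - 1)) ** n + c

def defect(x):
    """|f(alpha x + beta) - f(x)|, expected to be zero up to rounding."""
    return abs(f(alpha * x + beta) - f(x))
```

The reason is that $\alpha x + \beta + \dfrac{\beta}{\alpha - 1} = \alpha \left( x + \dfrac{\beta}{\alpha - 1} \right)$, so the substitution multiplies the $n$-th power by $\alpha^n = 1$.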


4.4 Transformations of polynomials


4.4.1 Tchirnhaus’s transformation


In 1683, in the Leipzig journal “Acta eruditorum”, E. V. von Tchirnhaus 
(1651-1708) published a method for the transformation of algebraic equa- 
tions which, he thought, allowed the solution by radicals of algebraic equa- 
tions of any degree. Leibniz immediately refuted Tchirnhaus’s claim on the 
omnipotence of his transformations. Moreover, it turned out that to solve 5th 
degree equations by means of Tchirnhaus’s transformations one had to solve 
an equation of degree 24 and rather complicated at that. 

Nevertheless, Tchirnhaus’s transformation has important applications. For example, with its help any equation of degree 5 without multiple roots can be reduced to the form $y^5 + 5y = a$, solving in the process only equations of degree 2 and 3.

Tchirnhaus’s transformation of the equation

$$x^n + c_1 x^{n-1} + \dots + c_n = 0 \tag{*}$$

consists of the following. Let $x_1, \dots, x_n$ be the roots of this equation. Consider a rational function $\varphi$ that does not tend to infinity at the points $x_1, \dots, x_n$. Set $y_i = \varphi(x_i)$ and consider the equation

$$R(y) = 0, \quad \text{where } R(y) = y^n + q_1 y^{n-1} + \dots + q_n, \tag{**}$$

whose roots are $y_1, \dots, y_n$. We show in what follows that if (**) has no multiple roots, then the $x_i$ can be expressed in terms of the $y_i$. Selecting an appropriate function $\varphi$ we can make the coefficients $q_1, \dots, q_{n-1}$ vanish. Unfortunately, to do this one has to solve an equation of order $(n - 1)!$ and this was precisely the circumstance that Leibniz pointed to.

Without loss of generality we can take $\varphi$ to be a polynomial of degree not higher than $n - 1$ due to the following statement.


Theorem 4.4.1. Let $x_1, \dots, x_n$ be the roots of a polynomial $f$ of degree $n$ and let $\varphi = \dfrac{P}{Q}$, where $P$ and $Q$ are polynomials such that $Q(x_i) \ne 0$ for all $i = 1, \dots, n$. Then there exists a polynomial $g$ of degree no higher than $n - 1$ whose values at $x_1, \dots, x_n$ coincide with the values of $\varphi$ at these points.

Proof. By hypothesis the polynomials $f$ and $Q$ have no common roots, so they are relatively prime. Hence there exist polynomials $R$ and $S$ such that $Rf + SQ = 1$. Since $f(x_i) = 0$, it follows that $S(x_i) = \dfrac{1}{Q(x_i)}$. Hence

$$\varphi(x_i) = \frac{P(x_i)}{Q(x_i)} = P(x_i) S(x_i).$$

Thus, for the required polynomial $g$ we may take the remainder after division of $PS$ by $f$.
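
The proof is constructive: $S$ comes from the extended Euclidean algorithm and $g$ is $PS$ reduced modulo $f$. The following sketch carries this out with exact rational arithmetic; the function name `tschirnhaus_polynomial`, the helpers and the sample data $f = x^2 - 2$, $\varphi = 1/(x + 1)$ are assumptions made for illustration.

```python
from fractions import Fraction

def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def sub(p, q):
    size = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) - (q[i] if i < len(q) else 0)
                 for i in range(size)])

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, q):
    """Quotient and remainder of p by a nonzero polynomial q (lowest degree first)."""
    p = trim([Fraction(c) for c in p])
    quot = [Fraction(0)] * max(len(p) - len(q) + 1, 1)
    while len(p) >= len(q) and any(p):
        shift = len(p) - len(q)
        factor = p[-1] / q[-1]
        quot[shift] = factor
        for i, c in enumerate(q):
            p[shift + i] -= factor * c
        p = trim(p)
    return trim(quot), p

def tschirnhaus_polynomial(f, P, Q):
    """Polynomial g, deg g < deg f, with g(x_i) = P(x_i)/Q(x_i) at the roots of f.

    Assumes that f and Q are relatively prime.
    """
    # Extended Euclid on (f, Q), tracking only the cofactor t of Q.
    r0, r1 = trim(f), trim(Q)
    t0, t1 = [Fraction(0)], [Fraction(1)]
    while any(r1):
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        t0, t1 = t1, sub(t0, mul(q, t1))
    # r0 is a nonzero constant d, so S = t0 / d satisfies S*Q = 1 modulo f.
    S = [c / r0[0] for c in t0]
    return divmod_poly(mul(P, S), f)[1]

g = tschirnhaus_polynomial([-2, 0, 1], [1], [1, 1])   # f = x^2 - 2, phi = 1/(x + 1)
```

Here `g` represents $x - 1$, in agreement with $1/(\sqrt{2} + 1) = \sqrt{2} - 1$ and $1/(-\sqrt{2} + 1) = -\sqrt{2} - 1$.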


In what follows we assume that to equation (*) we apply the transformation

$$y = g(x) = p_0 + p_1 x + \dots + p_{n-1} x^{n-1}.$$

Let us show in this case how we can calculate the coefficients of the polynomial $R(y)$, see (**), with given roots $y_i = g(x_i)$, $i = 1, \dots, n$.

To avoid cumbersome notation, we confine ourselves to the case $n = 3$. If $x^3 = -c_1 x^2 - c_2 x - c_3$, then

$$y x = p_0 x + p_1 x^2 + p_2 (-c_1 x^2 - c_2 x - c_3) = p_0' + p_1' x + p_2' x^2.$$

Similarly, $y x^2 = p_0'' + p_1'' x + p_2'' x^2$, where the $p_j''$ are linear functions in the $p_j$.

Therefore, if $x_i$ is a root of $f$ and $y_i = g(x_i)$, the system of equations

$$\begin{aligned} (p_0 - y) z_0 + p_1 z_1 + p_2 z_2 &= 0, \\ p_0' z_0 + (p_1' - y) z_1 + p_2' z_2 &= 0, \\ p_0'' z_0 + p_1'' z_1 + (p_2'' - y) z_2 &= 0 \end{aligned} \tag{1}$$

with $y = y_i$ has a nonzero solution $(z_0, z_1, z_2) = (1, x_i, x_i^2)$. Set

$$A = \begin{pmatrix} p_0 & p_1 & p_2 \\ p_0' & p_1' & p_2' \\ p_0'' & p_1'' & p_2'' \end{pmatrix}.$$

Then $\det(A - yI) = 0$ for $y = y_i = g(x_i)$. If the polynomial $\det(A - yI)$ has no multiple roots, it coincides, up to the factor $(-1)^n$, with the polynomial $R(y)$ to be found. Since the elements of $A$ linearly depend on the $p_j$, the coefficient $q_k$ is a polynomial of degree $k$ in the $p_j$.
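
For $n = 3$ the construction of $A$ and of $R(y)$ can be spelled out in a few lines. In the sketch below the rows of $A$ are obtained by repeatedly multiplying by $x$ modulo the cubic, and the coefficients $q_1, q_2, q_3$ are read off from the trace, the sum of principal $2 \times 2$ minors and the determinant. The function names and the two sample cubics are assumptions chosen for illustration.

```python
def times_x(v, c):
    """Multiply v0 + v1 x + v2 x^2 by x modulo x^3 + c1 x^2 + c2 x + c3."""
    c1, c2, c3 = c
    v0, v1, v2 = v
    return [-c3 * v2, v0 - c2 * v2, v1 - c1 * v2]

def tschirnhaus_matrix(p, c):
    """Rows are the coefficients of y, y*x, y*x^2 in the basis 1, x, x^2."""
    rows = [list(p)]
    for _ in range(2):
        rows.append(times_x(rows[-1], c))
    return rows

def transformed_coefficients(A):
    """(q1, q2, q3) with det(yI - A) = y^3 + q1 y^2 + q2 y + q3."""
    trace = A[0][0] + A[1][1] + A[2][2]
    minors = (A[0][0] * A[1][1] - A[0][1] * A[1][0]
              + A[0][0] * A[2][2] - A[0][2] * A[2][0]
              + A[1][1] * A[2][2] - A[1][2] * A[2][1])
    det = (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
           - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
           + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))
    return (-trace, minors, -det)

# x^3 - 2 = 0 with y = x^2: the transformed equation is y^3 - 4 = 0.
first = transformed_coefficients(tschirnhaus_matrix((0, 0, 1), (0, 0, -2)))

# x^3 - x = 0 (roots 0, 1, -1) with y = x + 1: roots 1, 2, 0, so y^3 - 3y^2 + 2y = 0.
second = transformed_coefficients(tschirnhaus_matrix((1, 1, 0), (0, -1, 0)))
```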

If the polynomial $R(y)$ has no multiple roots, the matrix $A$ has no multiple eigenvalues. Therefore to every eigenvalue of $A$ there corresponds a unique, up to a factor, solution of (1). This means that the root $x_i$ of the initial polynomial is uniquely recovered from the root $y_i$ of the transformed polynomial, and each $x_i$ depends rationally on $y_i$.

Tchirnhaus’s transformation also helps to solve 3rd and 4th degree equations by radicals. The cubic equation can be reduced to the form

$$y^3 + q_3 = 0$$

by solving a system consisting of a linear equation $q_1 = 0$ and a second degree equation $q_2 = 0$ in the parameters $p_0, p_1, p_2$. For this, one needs to solve a quadratic equation.

An arbitrary 4th degree equation can be reduced to the form

$$y^4 + q_2 y^2 + q_4 = 0.$$

For this, one has to solve a system consisting of a linear equation $q_1 = 0$ and a 3rd degree equation $q_3 = 0$, which reduces to the solution of a cubic equation.


4.4.2 5th degree equation in Bring’s form

Any 5th degree equation can be reduced to the form

$$y^5 + q_4 y + q_5 = 0$$

by solving a system of equations $q_1 = q_2 = q_3 = 0$. For this, one has to solve a 6th degree equation. A sharper analysis, performed in 1789 by the Swedish mathematician Bring, shows that in this case, instead of a 6th degree equation, it suffices, actually, to solve equations of degree 2 and 3. To satisfy the condition $q_1 = 0$, we express one of the parameters $p_0, \dots, p_4$ as a linear function of the remaining parameters. Then the coefficient $q_2$ represents a quadratic form in four of the parameters $p_i$. This quadratic form can be reduced to the basic form

$$u_1^2 + u_2^2 - v_1^2 - v_2^2,$$

where $u_i$ and $v_i$ are linear functions in the $p_j$ (for this, we have to take square roots). To satisfy the equation $q_2 = 0$, it suffices to solve the system of linear equations $u_1 = v_1$, $u_2 = v_2$. After that there remain two parameters for which the equation $q_3 = 0$ is of the 3rd degree. As a result, we get an equation of the form

$$y^5 + q_4 y + q_5 = 0.$$

If $q_4 \ne 0$, a linear substitution reduces this equation to the form $y^5 + 5y = a$.

Similarly, the equation

$$x^n + c_1 x^{n-1} + \dots + c_n = 0, \quad n \ge 5, \tag{1}$$

can be reduced to the form

$$y^n + q_4 y^{n-4} + q_5 y^{n-5} + \dots + q_n = 0$$

by means of the transformation

$$y = p_0 + p_1 x + p_2 x^2 + p_3 x^3 + p_4 x^4.$$

In the process we only have to solve equations of degree 2 and 3.

Moreover, instead of the system $q_1 = q_2 = q_3 = 0$ we can solve the system $q_1 = q_2 = q_4 = 0$, i.e., at the last step, instead of the cubic equation $q_3 = 0$, we solve the 4th degree equation $q_4 = 0$. Then (1) will be reduced to the form

$$y^n + q_3 y^{n-3} + q_5 y^{n-5} + \dots + q_n = 0.$$

Making use of these transformations and the change of variable $x \mapsto x^{-1}$ we can reduce the general 5th degree equation to any of the following forms:

$$x^5 + p x + q = 0,$$
$$x^5 + p x^2 + q = 0,$$
$$x^5 + p x^3 + q = 0,$$
$$x^5 + p x^4 + q = 0.$$


4.4.3 Representation of polynomials as sums of powers of linear 
functions 


The problem of representing polynomials as sums of powers of linear functions is simplest for the quadratic $x^2 + 2ax + b$. This problem is as follows:
Represent the given quadratic in the form

$$\lambda_1 (x + \alpha_1)^2 + \dots + \lambda_m (x + \alpha_m)^2$$

and investigate what is the minimal number of basic linear functions $x + \alpha_1, \dots, x + \alpha_m$ necessary to perform this.
There are two versions of this problem:

1) The basic functions are the same for all quadratics.
2) The basic functions depend on the given quadratic.

In the first case the minimal $m$ is 3. For the basic functions we can take any three distinct functions $x + \alpha_1$, $x + \alpha_2$, $x + \alpha_3$. Indeed, the system of equations for $\lambda_1, \lambda_2, \lambda_3$ is

$$\lambda_1 + \lambda_2 + \lambda_3 = 1,$$
$$\alpha_1 \lambda_1 + \alpha_2 \lambda_2 + \alpha_3 \lambda_3 = a,$$
$$\alpha_1^2 \lambda_1 + \alpha_2^2 \lambda_2 + \alpha_3^2 \lambda_3 = b.$$

The system always has a solution since its determinant is a non-vanishing Vandermonde determinant. It is also clear that two functions $x + \alpha_1$ and $x + \alpha_2$ are insufficient: $\lambda_3 = 0$ only if $a$, $b$, $\alpha_1$ and $\alpha_2$ are constrained by the relation $a(\alpha_1 + \alpha_2) = b + \alpha_1 \alpha_2$.

In the second case, the minimal $m$ is 2. The required representation is, e.g., of the form

$$x^2 + 2ax + b = \frac12 \left( x + a + \sqrt{b - a^2} \right)^2 + \frac12 \left( x + a - \sqrt{b - a^2} \right)^2.$$

For polynomials of degree $n$, the problem of selecting the universal basic functions $x + \alpha_1, \dots, x + \alpha_m$ is solved exactly as for $n = 2$. Let us formulate the answer as a theorem (the proof for $n > 2$ is the same as for $n = 2$).

Theorem 4.4.2. a) If $\alpha_1, \dots, \alpha_{n+1}$ are distinct numbers, then any polynomial of degree $n$ can be represented in the form

$$\lambda_1 (x + \alpha_1)^n + \dots + \lambda_{n+1} (x + \alpha_{n+1})^n.$$

b) If the numbers $\alpha_1, \dots, \alpha_m$ are such that any polynomial of degree $n$ can be represented in the form

$$\lambda_1 (x + \alpha_1)^n + \dots + \lambda_m (x + \alpha_m)^n,$$

then $m \ge n + 1$.
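
Part a) amounts to solving a Vandermonde-type linear system for the weights $\lambda_i$. The sketch below does this exactly with Gauss-Jordan elimination over the rationals; the function names, the convention that `coeffs[r]` is the coefficient of $x^{n-r}$, and the sample data are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def solve(matrix, rhs):
    """Solve a nonsingular square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def power_sum_weights(coeffs, alphas):
    """Weights lambda_i with sum_i lambda_i (x + alpha_i)^n equal to the given polynomial.

    coeffs[r] is the coefficient of x^(n - r) for r = 0..n; alphas must be
    n + 1 distinct numbers.
    """
    n = len(coeffs) - 1
    # The coefficient of x^(n - r) in sum_i lambda_i (x + alpha_i)^n is
    # C(n, r) * sum_i lambda_i * alpha_i^r.
    matrix = [[comb(n, r) * Fraction(a) ** r for a in alphas] for r in range(n + 1)]
    return solve(matrix, coeffs)

weights = power_sum_weights([1, 2, 3], [0, 1, 2])   # x^2 + 2x + 3
```

For this sample the weights are $1, -1, 1$, i.e. $x^2 + 2x + 3 = x^2 - (x + 1)^2 + (x + 2)^2$.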


It is not difficult to indicate a collection of universal basic linear forms for polynomials of degree $n$ in $m$ variables as well. For convenience, instead of a polynomial $f(x_1, \dots, x_m)$ of degree $n$, we will consider the homogeneous polynomial

$$F(x_0, x_1, \dots, x_m) = x_0^n f\left( \frac{x_1}{x_0}, \dots, \frac{x_m}{x_0} \right).$$

Theorem 4.4.3 ([So]). Let $\alpha_0, \dots, \alpha_n$ be distinct numbers, and let

$$Z_i = x_0 + \alpha_s x_1 + \alpha_t x_2 + \dots + \alpha_u x_m, \quad \text{where } \alpha_s, \alpha_t, \dots, \alpha_u \in \{\alpha_0, \dots, \alpha_n\}.$$

Then the forms $(Z_i)^n$ generate the linear space of all homogeneous polynomials of degree $n$ in $m + 1$ variables.

Remark. The number of forms $(Z_i)^n$ is equal to $(n + 1)^m$ while the dimension of the space of homogeneous polynomials of degree $n$ in $m + 1$ variables is equal to $\binom{n+m}{m}$.
Proof. For simplicity, we consider the case $m = 2$. In this case we have to represent the polynomial

$$P(x, y, z) = \sum_{0 \le i + j \le n} a_{ij}\, x^{n-i-j} y^i z^j$$

in the form

$$P(x, y, z) = \sum_{s,t=1}^{n+1} \lambda_{st} (x + \alpha_s y + \alpha_t z)^n = \sum_{s,t=1}^{n+1} \lambda_{st} \sum_{0 \le i + j \le n} c_{nij}\, \alpha_s^i \alpha_t^j\, x^{n-i-j} y^i z^j,$$

where $c_{nij} = \dfrac{n!}{i!\, j!\, (n - i - j)!}$. We obtain the system of equations

$$a_{ij} = c_{nij} \sum_{s,t=1}^{n+1} \lambda_{st} \alpha_s^i \alpha_t^j, \quad 0 \le i + j \le n.$$

Let us complement this system with the equations

$$\sum_{s,t=1}^{n+1} \lambda_{st} \alpha_s^i \alpha_t^j = 0 \quad \text{for } i + j > n, \ 1 \le i, j \le n.$$

We then obtain a system of linear equations with the matrix $V \otimes V$, where $V = \|\alpha_s^i\|$ ($i = 0, \dots, n$; $s = 1, \dots, n + 1$) is a Vandermonde matrix. For arbitrary square matrices $A$ and $B$, of sizes $a \times a$ and $b \times b$, respectively, it is easy to prove that

$$\det(A \otimes B) = (\det A)^b (\det B)^a,$$

using their Jordan normal form. Hence $\det(V \otimes V) = (\det V)^{2(n+1)} \ne 0$.
For arbitrary $m$, we similarly obtain a system of linear equations whose determinant is $\det(V \otimes \dots \otimes V) = (\det V)^{m (n+1)^{m-1}}$.


If for each polynomial we select its own basic linear functions, the problem becomes much more difficult. Adding the summand $b_i (x + \beta_i)^n$ increases the number of variable parameters by 2. The coincidence of the number of parameters on which the polynomial of degree $n$ depends with the number of parameters in the expression

$$b_1 (x + \beta_1)^n + \dots + b_k (x + \beta_k)^n$$

is only possible when $n$ is odd. Then $k = \frac12 (n + 1)$. It turns out that indeed, a generic polynomial of odd degree $n$ can be represented as the sum of $\frac12 (n + 1)$ summands of the form $b (x + \beta)^n$. But in certain degenerate cases several extra summands might become necessary.

Example. The polynomial $x^3 + x^2$ cannot be represented in the form

$$b_1 (x + \beta_1)^3 + b_2 (x + \beta_2)^3.$$

Proof. Clearly, $\beta_1 \ne \beta_2$ and $\beta_1 \beta_2 \ne 0$. In this case, the conditions $b_1 \beta_1^2 + b_2 \beta_2^2 = 0$ and $b_1 \beta_1^3 + b_2 \beta_2^3 = 0$ imply that $b_1 = b_2 = 0$: a contradiction.


Let us clarify, therefore, what the term “generic polynomial” means in the situation considered. We keep to the case $n = 5$ (for other odd $n$ the arguments are similar). Let us express the polynomial $f(x)$ of degree 5 in the form

$$a_5 x^5 + 5 a_4 x^4 + 10 a_3 x^3 + 10 a_2 x^2 + 5 a_1 x + a_0.$$

Let

$$\begin{vmatrix} 1 & z & z^2 & z^3 \\ a_0 & a_1 & a_2 & a_3 \\ a_1 & a_2 & a_3 & a_4 \\ a_2 & a_3 & a_4 & a_5 \end{vmatrix} = p_3 z^3 + p_2 z^2 + p_1 z + p_0 = p(z).$$

We say that $f(x)$ is a generic polynomial if $p(z)$ is a polynomial with precisely three distinct roots (in particular, $p_3 \ne 0$).


Theorem 4.4.4 (Sylvester). The generic polynomial of odd degree $n$ can be represented as the sum

$$b_1 (x + \beta_1)^n + \dots + b_k (x + \beta_k)^n,$$

where $k = \dfrac{n + 1}{2}$.

Proof. For $n = 5$, we have to solve the system of equations

$$b_1 \beta_1^r + b_2 \beta_2^r + b_3 \beta_3^r = a_r, \quad r = 0, 1, \dots, 5. \tag{1}$$

For $\beta_1$, $\beta_2$ and $\beta_3$ we take the roots of the equation $p(z) = 0$. By the hypothesis these roots are distinct. Hence, from the system (1) for $r = 0, 1, 2$ we uniquely find $b_1$, $b_2$ and $b_3$. It remains to prove that, for the values $b_i$ and $\beta_i$ obtained, equations (1) are satisfied for $r = 3, 4, 5$.

From the definition of $p(z)$ it follows that

$$\begin{vmatrix} x_0 & x_1 & x_2 & x_3 \\ a_0 & a_1 & a_2 & a_3 \\ a_1 & a_2 & a_3 & a_4 \\ a_2 & a_3 & a_4 & a_5 \end{vmatrix} = p_3 x_3 + p_2 x_2 + p_1 x_1 + p_0 x_0$$

for any numbers $x_0, x_1, x_2, x_3$. Further, if we replace the row $(x_0, x_1, x_2, x_3)$ by $(a_i, a_{i+1}, a_{i+2}, a_{i+3})$, where $i = 0, 1, 2$, the determinant vanishes. For $i = 0$, we then have

$$p_3 a_3 + p_2 a_2 + p_1 a_1 + p_0 a_0 = 0,$$

i.e.,

$$-p_3 a_3 = p_0 \sum b_i + p_1 \sum b_i \beta_i + p_2 \sum b_i \beta_i^2 = \sum b_i (p_0 + p_1 \beta_i + p_2 \beta_i^2) = -\sum b_i p_3 \beta_i^3.$$

Since $p_3 \ne 0$, we obtain (1) for $r = 3$. Taking $i = 1$ and 2 we similarly obtain (1) for $r = 4$ and 5.
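
The role of the polynomial $p(z)$ can be illustrated by running the construction backwards: start from chosen $b_i$ and distinct $\beta_i$, form the numbers $a_r$ from system (1), and check that the $\beta_i$ are exactly the roots of $p(z)$. The sketch below does this with exact arithmetic; the sample values and helper names are assumptions made for illustration, and no root finding is attempted.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Arbitrary sample data: three nonzero weights and three distinct beta_i.
b = [Fraction(2), Fraction(-1), Fraction(3)]
beta = [Fraction(1), Fraction(-2), Fraction(1, 2)]

# a_r = b_1 beta_1^r + b_2 beta_2^r + b_3 beta_3^r for r = 0..5, as in system (1).
a = [sum(bi * be ** r for bi, be in zip(b, beta)) for r in range(6)]

lower_rows = [a[0:4], a[1:5], a[2:6]]

def p(z):
    """The determinant defining p(z), with first row (1, z, z^2, z^3)."""
    return det([[1, z, z ** 2, z ** 3]] + lower_rows)

values_at_beta = [p(be) for be in beta]
value_elsewhere = p(Fraction(3))
```

All entries of `values_at_beta` vanish while `value_elsewhere` does not, so for these data $p$ is a cubic whose roots are precisely $\beta_1, \beta_2, \beta_3$.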
4.5 Algebraic numbers


4.5.1 Definition and main properties of algebraic numbers


The number $\alpha \in \mathbb{C}$ is called algebraic if it is a root of an irreducible polynomial with rational coefficients. If the highest coefficient of the polynomial is equal to 1 and the remaining coefficients are integers, then the number $\alpha$ is said to be an algebraic integer. To every algebraic number $\alpha$, there corresponds a unique irreducible monic polynomial $f$. The roots of this polynomial are called numbers conjugate to $\alpha$.

It is not difficult to show that if $\alpha$ is a root of an arbitrary (i.e., not necessarily irreducible) monic polynomial with integer coefficients, then $\alpha$ is an algebraic integer. In other words,

if a monic polynomial with integer coefficients is represented as the product of two monic polynomials with rational coefficients, then all these rational coefficients are integers.

This statement is one of the possible formulations of Gauss’s lemma (Lemma 2.1.1 on page 49).


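Theorem 4.5.1. Let $\alpha$ and $\beta$ be algebraic numbers, and let $\varphi(x, y)$ be an
arbitrary polynomial with rational coefficients. Then $\varphi(\alpha, \beta)$ is an
algebraic number.


Proof. Let $\{\alpha_1, \dots, \alpha_n\}$ and $\{\beta_1, \dots, \beta_m\}$ be the sets of
numbers conjugate to $\alpha$ and $\beta$, respectively. Consider the polynomial

\[ F(t) = \prod_{i=1}^{n} \prod_{j=1}^{m} \bigl(t - \varphi(\alpha_i, \beta_j)\bigr). \]

The coefficients of this polynomial are symmetric functions in $\alpha_1, \dots, \alpha_n$
and in $\beta_1, \dots, \beta_m$. Hence they are rational numbers. Since
$\varphi(\alpha, \beta)$ is one of the roots of $F$, it is an algebraic number.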
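
As a numerical illustration of the construction in this proof, the following sketch takes $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$ and $\varphi(x, y) = x + y$ (these choices, and the helper name `poly_from_roots`, are assumptions made for the example, not part of the text). It multiplies out the product over all pairs of conjugates in floating point and rounds the coefficients, which should come out as integers.

```python
from itertools import product
from math import sqrt

def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with the given roots."""
    coeffs = [1.0]
    for r in roots:
        nxt = coeffs + [0.0]           # multiply the current polynomial by t
        for i, c in enumerate(coeffs):
            nxt[i + 1] -= r * c        # subtract r times the current polynomial
        coeffs = nxt
    return coeffs

def phi(x, y):
    return x + y

alpha_conjugates = [sqrt(2), -sqrt(2)]   # roots of x^2 - 2
beta_conjugates = [sqrt(3), -sqrt(3)]    # roots of x^2 - 3

roots = [phi(a, b) for a, b in product(alpha_conjugates, beta_conjugates)]
coefficients = [round(c) for c in poly_from_roots(roots)]
print(coefficients)   # coefficients of t^4 - 10 t^2 + 1
```

The rounding step is only a convenience for display; an exact computation would work with symmetric functions rather than floating-point roots.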


Remark. One can similarly prove that, if $\alpha$ and $\beta$ are algebraic integers
and $\varphi(x, y)$ is a polynomial with integer coefficients, then $\varphi(\alpha, \beta)$
is an algebraic integer.


In particular, if $\alpha$ and $\beta$ are algebraic numbers (integers), then so are
$\alpha\beta$ and $\alpha + \beta$. Further, if $\alpha \neq 0$ is an algebraic number, then
$\alpha^{-1}$ is also an algebraic number. Indeed, if $\alpha$ is a root of the polynomial
$\sum a_k x^k$ of degree $n$, then $\alpha^{-1}$ is a root of the polynomial
$\sum a_k x^{n-k}$. But if $\alpha$ is an algebraic integer, $\alpha^{-1}$ is not
necessarily an algebraic integer.

Therefore the algebraic numbers constitute a field and the algebraic integers
constitute a ring.

Theorem 4.5.2. Let $\alpha$ and $\beta$ be algebraic numbers constrained by the relation
$\varphi(\alpha, \beta) = 0$, where $\varphi$ is a polynomial with rational coefficients.
Then, for any number $\alpha_i$ conjugate to $\alpha$, there exists a number $\beta_j$
conjugate to $\beta$ and such that $\varphi(\alpha_i, \beta_j) = 0$.

Proof. Let $\{\alpha_1, \dots, \alpha_n\}$ and $\{\beta_1, \dots, \beta_m\}$ be the sets of
all numbers conjugate to $\alpha$ and $\beta$, respectively. Consider the polynomial

\[ f(x) = \prod_{j=1}^{m} \varphi(x, \beta_j). \]

The coefficients of this polynomial are rational and $f(\alpha) = 0$. Hence $f(x)$ is
divisible by $\prod (x - \alpha_i)$, and therefore $f(\alpha_i) = 0$, i.e.,
$\varphi(\alpha_i, \beta_j) = 0$ for some $j$.


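Theorem 4.5.3. Let $\alpha$ be a root of the polynomial

\[ f(x) = x^n + \beta_{n-1} x^{n-1} + \dots + \beta_0, \]

where $\beta_0, \dots, \beta_{n-1}$ are algebraic integers. Then $\alpha$ is an algebraic
integer.


Proof. Consider the polynomial

\[ F(x) = \prod_{i, \dots, j} \bigl(x^n + \beta_{n-1,i} x^{n-1} + \dots + \beta_{0,j}\bigr), \]

where $\{\beta_{n-1,i}\}, \dots, \{\beta_{0,j}\}$ are the sets of all the numbers conjugate
to $\beta_{n-1}, \dots, \beta_0$, respectively, and the product runs over all combinations
of these conjugates. It is easy to verify that the coefficients of $F$ are integers.
It is also clear that $\alpha$ is a root of $F$.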


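The algebraic number $\alpha$ is called totally real if all its conjugates are real,
in other words, if all the roots of the irreducible polynomial with root $\alpha$ are
real.


Example. The number $\alpha = 2\cos\left(\frac{2k\pi}{n}\right)$ is totally real.

Indeed, $\alpha = \varepsilon + \varepsilon^{-1}$, where
$\varepsilon = \exp\left(\frac{2k\pi i}{n}\right)$. Let $\alpha_1$ be conjugate to $\alpha$.
Theorem 4.5.2 implies that $\alpha_1 = \varepsilon_1 + \varepsilon_1^{-1}$, where
$\varepsilon_1$ is conjugate to $\varepsilon$. The number $\varepsilon$ satisfies the
equation $x^n - 1 = 0$, and hence $\varepsilon_1$ is also its root, and therefore
$\varepsilon_1 = \exp\left(\frac{2l\pi i}{n}\right)$. In this case
$\alpha_1 = 2\cos\left(\frac{2l\pi}{n}\right) \in \mathbb{R}$.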
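
The example can be checked numerically in a small case. The sketch below assumes $n = 7$ and uses the known fact that the conjugates of $2\cos(2\pi/7)$ are $2\cos(2\pi k/7)$ for $k = 1, 2, 3$; the helper `poly_from_roots` is a name introduced here for illustration. All three roots are real, and the rounded coefficients of their monic polynomial are integers.

```python
from math import cos, pi

def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with the given roots."""
    coeffs = [1.0]
    for r in roots:
        nxt = coeffs + [0.0]           # multiply by x
        for i, c in enumerate(coeffs):
            nxt[i + 1] -= r * c        # subtract r times the current polynomial
        coeffs = nxt
    return coeffs

n = 7
conjugates = [2 * cos(2 * pi * k / n) for k in (1, 2, 3)]   # all real numbers
coefficients = [round(c) for c in poly_from_roots(conjugates)]
print(coefficients)   # coefficients of x^3 + x^2 - 2x - 1
```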
4.5.2 Kronecker's theorem

In 1857, Kronecker [Kr1] proved the following statement.


Theorem 4.5.4 (Kronecker). a) Let $\alpha \neq 0$ be an algebraic integer. If $\alpha$
is not a root of unity, then at least one number conjugate to $\alpha$ has absolute
value strictly greater than 1.

b) Let $\beta$ be a totally real algebraic integer. If $\beta \neq 2\cos r\pi$, where
$r \in \mathbb{Q}$, then at least one number conjugate to $\beta$ has absolute value
strictly greater than 2.


Proof. a) Let $\{\alpha_1, \dots, \alpha_n\}$ be the set of all numbers conjugate to
$\alpha$. Suppose on the contrary that $|\alpha_i| \le 1$, $i = 1, \dots, n$, and consider
the polynomial

\[ f_k(x) = (x - \alpha_1^k) \cdot \ldots \cdot (x - \alpha_n^k)
= x^n + a_{k,n-1} x^{n-1} + \dots + a_{k,0}. \]

Since $\alpha$ is an algebraic integer, it follows that
$a_{k,n-1}, \dots, a_{k,0} \in \mathbb{Z}$. The conditions $|\alpha_i| \le 1$ for
$i = 1, \dots, n$ imply that $|a_{k,s}| \le \binom{n}{s}$. Therefore the coefficients of
the polynomials $f_1, f_2, \dots$ assume only finitely many values, and hence, among
these polynomials, there are only finitely many distinct ones. But then the set of
roots of these polynomials is also finite, and all the numbers
$\alpha, \alpha^2, \alpha^3, \dots$ are in this set. Therefore

\[ \alpha^p = \alpha^q \quad \text{for some } p, q \in \mathbb{N} \text{ and } p \neq q. \]

Since $\alpha \neq 0$, it follows that $\alpha^{p-q} = 1$.

b) Let $\{\beta_1, \dots, \beta_n\}$ be the set of all numbers conjugate to $\beta$. By
hypothesis all of them are real. Suppose on the contrary that $|\beta_i| \le 2$,
$i = 1, \dots, n$. Then the absolute values of all the numbers conjugate to

\[ \alpha = \frac{\beta}{2} + i\sqrt{1 - \frac{\beta^2}{4}} \]

are equal to 1. Indeed, the numbers $\alpha$ and $\beta$ satisfy
$\alpha^2 - \beta\alpha + 1 = 0$. Hence, by Theorem 4.5.2, any number $\alpha_i$ conjugate
to $\alpha$ satisfies an equation of the form $\alpha_i^2 - \beta_j \alpha_i + 1 = 0$.
Since $|\beta_j| \le 2$, we have $\beta_j^2/4 \le 1$, and therefore

\[ |\alpha_i|^2 = \frac{\beta_j^2}{4} + \left(1 - \frac{\beta_j^2}{4}\right) = 1. \]

By Theorem 4.5.3 the number $\alpha$ is an algebraic integer. Therefore we can
apply part a) to it. As a result, we see that $\alpha = e^{i\pi r}$, where
$r \in \mathbb{Q}$. Therefore

\[ \beta = \alpha + \alpha^{-1} = \alpha + \bar{\alpha} = 2\cos r\pi, \]

as was required.
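
The finiteness argument in part a) can be observed on a root of unity. The sketch below (an illustration only; the choice of the primitive cube roots of unity and the names `poly_from_roots` and `power_polynomial` are assumptions of this example) computes the integer polynomials $f_k$ for $k = 1, \dots, 30$ and collects the distinct ones.

```python
import cmath

def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with the given roots."""
    coeffs = [1 + 0j]
    for r in roots:
        nxt = coeffs + [0j]            # multiply by x
        for i, c in enumerate(coeffs):
            nxt[i + 1] -= r * c        # subtract r times the current polynomial
        coeffs = nxt
    return coeffs

def power_polynomial(conjugates, k):
    """Rounded integer coefficients of f_k(x) = prod_i (x - alpha_i^k)."""
    coeffs = poly_from_roots([a ** k for a in conjugates])
    return tuple(round(c.real) for c in coeffs)

omega = cmath.exp(2j * cmath.pi / 3)
conjugates = [omega, omega.conjugate()]          # roots of x^2 + x + 1

distinct = {power_polynomial(conjugates, k) for k in range(1, 31)}
print(sorted(distinct))   # only x^2 - 2x + 1 and x^2 + x + 1 occur
```

Only two polynomials appear, in agreement with the bound $|a_{k,s}| \le \binom{n}{s}$, which leaves finitely many candidates.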


Let us give an interesting application of Kronecker’s theorem. 




Theorem 4.5.5 (Minkowski). Let $A$ be a square matrix with integer elements.
Suppose that all the elements of the matrix $A - I$, where $I$ is the unit
matrix, are divisible by an integer $n \ge 2$ but $A \neq I$. Then

a) If $n > 2$, then $A^m \neq I$ for all positive integers $m$;

b) If $n = 2$ and $A^2 \neq I$, then $A^m \neq I$ for all positive integers $m$.


Proof. By hypothesis $A = I + nB$, where $B$ is a matrix with integer
elements. In particular, all the eigenvalues of $B$ are algebraic integers. The
eigenvalues $\alpha$ of $A$ and the eigenvalues $\beta$ of $B$ are related by the equation
$\alpha = 1 + n\beta$.

Suppose that $A^m = I$. Then $\alpha^m = 1$, and hence $|\alpha| = 1$. Therefore

\[ |\beta| = \frac{|\alpha - 1|}{n} \le \frac{2}{n} \le 1. \tag{1} \]

Inequality (1) is strict except for the case when $n = 2$ and $|\alpha - 1| = 2$, i.e.,
$\alpha = -1$.

For $n > 2$, inequality (1) is strict. In this case the absolute values of the
algebraic integer $\beta$ and of all its conjugates (which are also eigenvalues of $B$)
are strictly less than 1. Hence $\beta = 0$ by Theorem 4.5.4 a), and therefore
$\alpha = 1$.

The identity $A^m = I$ can only hold if all the Jordan blocks of $A$ are of size
$1 \times 1$. If all the eigenvalues of $A$ in this case are equal to 1, then $A = I$,
which contradicts our hypothesis.

Now consider the case $n = 2$. In this case $\alpha = \pm 1$. Therefore the Jordan
form of $A$ is a diagonal matrix with elements $\pm 1$ on the main diagonal. Hence
$A^2 = I$, which contradicts our hypothesis.
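
A bounded search cannot prove the theorem, but it shows the two cases side by side. In the sketch below the matrices `A3` and `A2` and the helper `order_up_to` are choices made for this illustration: `A3 - I` is divisible by 3, while `A2 - I` is divisible by 2 and `A2` squares to the identity, which is exactly the situation excluded in part b).

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def order_up_to(a, max_power):
    """Least m <= max_power with a^m = I, or None if no such m is found."""
    power = a
    for m in range(1, max_power + 1):
        if power == identity(len(a)):
            return m
        power = mat_mul(power, a)
    return None

A3 = [[1, 3], [0, 1]]     # A - I divisible by n = 3 and A != I
A2 = [[-1, 0], [2, 1]]    # A - I divisible by n = 2 and A != I

print(order_up_to(A3, 200))   # no power up to 200 equals I
print(order_up_to(A2, 200))   # A2 squared is the identity
```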


Elementary but rather cumbersome estimates enable us to sharpen Kro- 
necker’s theorem as follows. 


Theorem 4.5.6 ([ScZ]). a) Let $\alpha \neq 0$ be an algebraic integer which is not
a root of unity, and let $\{\alpha_1, \dots, \alpha_n\}$ be the set of all the conjugates
of $\alpha$. If $2s$ of the numbers $\alpha_1, \dots, \alpha_n$ are non-real, then

\[ \max_{1 \le i \le n} |\alpha_i| > 1 + 4^{-s-2}. \]

b) Let $\beta$ be a totally real algebraic integer such that $\beta \neq 2\cos r\pi$,
$r \in \mathbb{Q}$; let $\{\beta_1, \dots, \beta_n\}$ be the set of all the conjugates of
$\beta$. Then

\[ \max_{1 \le i \le n} |\beta_i| > 2 + 4^{-2n-3}. \]

4.5.3 Liouville’s theorem 


Euler conjectured that not all numbers are algebraic but he could not prove
this. The first to prove the existence of transcendental (i.e., not algebraic)
numbers was Liouville in 1844. In 1874, Cantor showed that there are more
transcendental numbers than algebraic ones, in the sense that the set of
algebraic numbers is countable whereas the set of all real (or complex) numbers
is uncountable.

Liouville's proof is based on a relatively simple but important remark:
an irrational algebraic number cannot be approximated too well by rationals.
More precisely, the following statement holds.


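Theorem 4.5.7 (Liouville). Let $\alpha$ be a root of an irreducible polynomial
$f(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_0$, where $n \ge 2$. Then there exists a number
$c > 0$ (depending only on $\alpha$) such that

\[ \left|\alpha - \frac{p}{q}\right| \ge \frac{c}{q^n} \tag{1} \]

for any integer $p$ and any positive integer $q$.


Proof. If $\left|\alpha - \frac{p}{q}\right| \ge 1$, then (1) holds for $c = 1$. Hence, we
assume that $\left|\alpha - \frac{p}{q}\right| < 1$. Let us express $f(x)$ in the form
$f(x) = a_n \prod_{i=1}^{n} (x - \alpha_i)$, where $\alpha_1 = \alpha$. Then

\[
\left|f\!\left(\frac{p}{q}\right)\right|
= |a_n| \cdot \left|\alpha - \frac{p}{q}\right| \cdot \prod_{i=2}^{n} \left|\frac{p}{q} - \alpha_i\right|
\le |a_n| \cdot \left|\alpha - \frac{p}{q}\right| \cdot \prod_{i=2}^{n} \bigl(|\alpha| + 1 + |\alpha_i|\bigr)
= c_1 \left|\alpha - \frac{p}{q}\right|,
\]

where $c_1$ is a positive number that depends only on $|a_n|$ and $\alpha$.
Let us assume that $a_0, \dots, a_n$ are integers relatively prime to each other.
Then the number $|a_n|$ is completely determined by $\alpha$. Moreover, the number

\[ q^n f\!\left(\frac{p}{q}\right) = a_n p^n + a_{n-1} p^{n-1} q + \dots + a_0 q^n \]

is an integer, and it is nonzero because an irreducible polynomial of degree
$n \ge 2$ has no rational roots; hence $\left|q^n f\!\left(\frac{p}{q}\right)\right| \ge 1$.
Therefore

\[ \left|\alpha - \frac{p}{q}\right| \ge c_1^{-1} \left|f\!\left(\frac{p}{q}\right)\right| \ge \frac{c}{q^n}, \]

where $c = c_1^{-1}$. Since $c_1 \ge 1$, this value of $c$ also works in the first case.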
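
To see the constant from the proof at work, the following sketch treats $\alpha = \sqrt{2}$, a root of $x^2 - 2$ (so $n = 2$, $a_n = 1$, and the other conjugate is $-\sqrt{2}$; these concrete values and the function name `liouville_holds` are assumptions of the example). It computes $c_1$ as in the proof and checks inequality (1) for the best numerator $p$ at each denominator $q \le 5000$ in floating point.

```python
from math import sqrt

alpha = sqrt(2)
conjugates = [sqrt(2), -sqrt(2)]   # roots of f(x) = x^2 - 2, conjugates[0] = alpha
a_n = 1                            # leading coefficient of f
n = 2                              # degree of f

# c_1 = |a_n| * prod_{i >= 2} (|alpha| + 1 + |alpha_i|), as in the proof
c1 = abs(a_n)
for alpha_i in conjugates[1:]:
    c1 *= abs(alpha) + 1 + abs(alpha_i)
c = min(1.0, 1.0 / c1)

def liouville_holds(q):
    """Check |alpha - p/q| >= c / q^n for the nearest numerator p."""
    p = round(q * alpha)           # any other numerator gives a larger distance
    return abs(alpha - p / q) >= c / q ** n

violations = [q for q in range(1, 5001) if not liouville_holds(q)]
print(round(c, 4), violations)
```

No violations are expected; the check is numerical and limited to the tested range, so it illustrates the inequality rather than proving it.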


Theorem 4.5.8 (Liouville). The number $\alpha = \sum_{k=0}^{\infty} 2^{-k!}$ is
transcendental.


Proof. For any positive integer $N$, consider the number
$\alpha_N = \sum_{k=0}^{N} 2^{-k!} = \frac{p}{q}$, where $p$ is an integer and
$q = 2^{N!}$. We have

\[
\left|\alpha - \frac{p}{q}\right| = 2^{-(N+1)!} + 2^{-(N+2)!} + \dots
\le 2 \cdot 2^{-(N+1)!} = 2 q^{-(N+1)}.
\]

Suppose that $\alpha$ is an algebraic number of degree $n$, i.e., a root of a
polynomial of degree $n$ with rational coefficients. Then by Theorem 4.5.7

\[ \left|\alpha - \frac{p}{q}\right| \ge c q^{-n}, \]

and therefore $2 q^{-(N+1)} \ge c q^{-n}$, i.e.,
$c \le 2 q^{n-N-1} = 2 \cdot 2^{N!(n-N-1)}$. But
$\lim_{N \to \infty} 2^{N!(n-N-1)} = 0$, which is clearly a contradiction when $N$ is
large enough.
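
The tail estimate used in this proof can be checked with exact rational arithmetic. The sketch below is an illustration with assumed helper names (`partial_sum`, `tail_within_bound`); it replaces the infinite tail by its first two terms, which is a lower bound for the true tail, and compares it with $2q^{-(N+1)}$.

```python
from fractions import Fraction
from math import factorial

def partial_sum(N):
    """Exact value of sum_{k=0}^{N} 2^(-k!)."""
    return sum(Fraction(1, 2 ** factorial(k)) for k in range(N + 1))

def tail_within_bound(N, extra_terms=2):
    """Compare a truncated tail alpha_{N+extra} - alpha_N with 2 * q^-(N+1), q = 2^(N!)."""
    q = 2 ** factorial(N)
    truncated_tail = partial_sum(N + extra_terms) - partial_sum(N)
    return truncated_tail < Fraction(2, q ** (N + 1))

checks = [tail_within_bound(N) for N in range(1, 5)]
print(checks)
```

Since $q^{N+1} = 2^{(N+1)!}$, the approximation error shrinks much faster than any fixed power $q^{-n}$, which is what the contradiction in the proof rests on.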


Inequality (1) can be expressed as

\[ |q\alpha - p| \ge \frac{c}{q^{n-1}}. \]

Set $P(x) = qx - p$. Then

\[ |P(\alpha)| \ge \frac{c}{H^{n-1}}, \tag{2} \]

where $H = \max\{|p|, |q|\}$ is the height of $P$.
An inequality similar to (2) holds also for polynomials $P$ of arbitrary
degree.


Theorem 4.5.9. Let $\alpha$ be an algebraic number of degree $n$. Then there exists
a number $c > 0$ (depending only on $\alpha$) such that, for any polynomial $P$ of
degree $k$ with integer coefficients, either $P(\alpha) = 0$ or

\[ |P(\alpha)| \ge \frac{c^k}{H^{n-1}}, \]

where $H$ is the height of $P$ (i.e., the greatest of the absolute values of the
coefficients of $P$).


Proof. Let $P(x) = a_k x^k + \dots + a_1 x + a_0$, where $a_i \in \mathbb{Z}$, and
$P(\alpha) \neq 0$. For a suitable positive integer $r$, the number $\beta = r\alpha$ is an
algebraic integer. Define the polynomial $Q$ by the relation

\[ Q(rx) = r^k P(x). \]

Clearly

\[ Q(y) = r^k P\!\left(\frac{y}{r}\right) = a_k y^k + r a_{k-1} y^{k-1} + \dots + r^k a_0 \]

is a polynomial with integer coefficients. Therefore the product
$Q(\beta_1) \cdot \ldots \cdot Q(\beta_n)$, where $\beta_1 = \beta, \dots, \beta_n$ are all the
numbers conjugate to $\beta$, is a nonzero integer. Hence
$|Q(\beta)| \cdot |Q(\beta_2) \cdot \ldots \cdot Q(\beta_n)| \ge 1$, i.e.,

\[ r^{kn} |P(\alpha)| \cdot |P(\alpha_2) \cdot \ldots \cdot P(\alpha_n)| \ge 1. \tag{3} \]

On the other hand,

\[ |P(\alpha_i)| \le H \bigl(1 + |\alpha_i| + \dots + |\alpha_i|^k\bigr) \le H (1 + |\alpha_i|)^k. \tag{4} \]

Let $h(\alpha) = \max\{|\alpha_1|, \dots, |\alpha_n|\}$. Then (3) and (4) imply that

\[ |P(\alpha)| \ge \frac{1}{r^{kn} H^{n-1} (1 + h(\alpha))^{k(n-1)}}. \]

Choosing

\[ c = \frac{1}{r^n (1 + h(\alpha))^{n-1}} \]

we get the inequality desired.


For $n = 2$, Liouville's theorem cannot be improved in the sense that the
inequality

\[ \left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2} \]

has infinitely many solutions. For $n \ge 3$, however, the estimate (1) was
successively sharpened by Thue, Siegel, Dyson, Gelfond, Schneider, Roth and
others. For example, Roth [Ro4] proved the following statement.


Theorem 4.5.10 (Roth). Let $\alpha$ be an irrational algebraic number and let
$\delta$ be a positive number however small. Then the inequality

\[ \left|\alpha - \frac{p}{q}\right| < \frac{1}{q^{2+\delta}} \]

holds only for finitely many pairs $p$ and $q$, where $q$ ($> 0$) and $p$ are integers.


For the proof of Roth’s theorem, see, e.g., the book [Ca4]. 


4.6 Problems to Chapter 4 


4.1 Let $z_1, \dots, z_n$ be the vertices of a regular $n$-gon, $z_0$ its center. Prove
that, if $P$ is a polynomial of degree no higher than $n - 1$, then
$\frac{1}{n} \sum_{k=1}^{n} P(z_k) = P(z_0)$.


4.2 Let $P(x, y)$ be a polynomial such that

\[ P(x, y) = P(x + 1, y + 1). \]

Prove that $P(x, y) = \sum a_k (x - y)^k$.
4.3 Let $f(x)$ be a polynomial of degree $n$ with only simple roots $x_1, \dots, x_n$.
Prove that

a) $\sum_{i=1}^{n} \frac{x_i^k}{f'(x_i)} = 0$ for $k = 0, 1, \dots, n - 2$;

b) ...

4.4 Let $P(z)$ be a polynomial of degree $n$ and $\max_{|z|=1} |P(z)| \le 1$. Prove that,
if $P(a) = 0$, then
P 1 
max (2) < = and max ) < i 
|z|=1|z —a 2 lzl<l]z—a 1+ |a| 


4.5 Let $d = x^2 + ax + b \in \mathbb{Z}[x]$.

a) Prove that the equation $p^2 - dq^2 = 1$ has non-trivial solutions
$p, q \in \mathbb{Z}[x]$ in exactly the following cases:

1) $a$ is odd and $4b = a^2 - 1$;

2) $a$ is even and $b = \left(\frac{a}{2}\right)^2 \pm 1$ or $b = \left(\frac{a}{2}\right)^2 \pm 2$.

b) Prove that the equation $p^2 - dq^2 = -1$ has non-trivial solutions
$p, q \in \mathbb{Z}[x]$ if and only if $a$ is even and $b = \left(\frac{a}{2}\right)^2 + 1$.


4.6 Prove that there exists a unique, up to multiplication by $-1$, polynomial
$f(x)$ of degree $n$ for which the function $(x + 1)(f(x))^2 - 1$ is odd.


4.7 Let $P_n(x)$ be a polynomial of degree $n$ over $\mathbb{C}$. Prove that for
$n = 4$, 6 and 8 almost all polynomials $P_n(x)$ can be represented in the following
form:

\[ P_4 = u^4 + v^4 + \lambda u^2 v^2; \]
\[ P_6 = u^6 + v^6 + w^6 + \lambda uvw(u - v)(v - w)(w - u); \]
\[ P_8 = u^8 + v^8 + w^8 + z^8 + \lambda u^2 v^2 w^2 z^2, \]

where $u$, $v$, $w$, $z$ are linear functions and $\lambda$ is a number.


4.8 Let the numbers $\alpha_1, \dots, \alpha_n \in \mathbb{C}$ be such that
$\sum \alpha_i^m$ is an integer for any positive integer $m$. Prove that all the
coefficients of the polynomial $\prod (x - \alpha_i)$ are integers.


5 


Galois Theory 


5.1 Lagrange’s theorem and the Galois resolvent 


5.1.1 Lagrange’s theorem 


Let $K$ be a field of characteristic 0 and $\varphi$ a rational function in variables
$x_1, \dots, x_n$ over $K$. Let $S_n$ denote the permutation group of $n$ elements.
We can assign to $\varphi$ the stabilizer of $\varphi$, i.e., the group

\[ G_\varphi = \{\sigma \in S_n \mid \varphi(x_{\sigma(1)}, \dots, x_{\sigma(n)}) = \varphi(x_1, \dots, x_n)\}. \]

For example, if $\varphi$ is a symmetric function, then $G_\varphi = S_n$, whereas if
$\varphi = \sum a_i x_i$, where the numbers $a_1, \dots, a_n$ are distinct, then $G_\varphi$
contains only the identity permutation.
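
For small $n$ the stabilizer can be found by brute force. The sketch below is a heuristic under stated assumptions: a permutation is accepted if it preserves the value of the function at a few fixed integer sample points, which is a necessary condition for invariance and is decisive for the simple examples chosen here. The function name `stabilizer` and the sample points are introduced for illustration.

```python
from itertools import permutations

def stabilizer(phi, n, sample_points):
    """Permutations sigma (0-based tuples) with phi(x_sigma) == phi(x) at every sample point."""
    group = []
    for sigma in permutations(range(n)):
        if all(phi(*(x[sigma[i]] for i in range(n))) == phi(*x) for x in sample_points):
            group.append(sigma)
    return group

points = [(1, 10, 100, 1000), (2, 3, 5, 7)]

def symmetric(x1, x2, x3, x4):
    return x1 + x2 + x3 + x4

def paired(x1, x2, x3, x4):
    return x1 * x2 + x3 * x4

def linear(x1, x2, x3, x4):
    return x1 + 2 * x2 + 3 * x3 + 4 * x4      # distinct coefficients

sizes = [len(stabilizer(f, 4, points)) for f in (symmetric, paired, linear)]
print(sizes)   # orders of the three stabilizers in S_4
```

The symmetric function is fixed by all 24 permutations and the linear form with distinct coefficients only by the identity, as stated above; the intermediate example $x_1 x_2 + x_3 x_4$ has a stabilizer of order 8.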


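Theorem 5.1.1 (Lagrange). Let $\varphi, \psi \in K(x_1, \dots, x_n)$ and
$G_\varphi \subset G_\psi$. Then $\psi = R(\varphi)$, where $R$ is a rational function whose
coefficients are symmetric functions in $x_1, \dots, x_n$.


First proof. Let us split $S_n$ into non-intersecting cosets
$h_1 G_\varphi = G_\varphi$, $h_2 G_\varphi$, ..., $h_k G_\varphi$. To every coset
$h_i G_\varphi$ there corresponds a function $\varphi_i$, the image of $\varphi$ under the
action of this coset; clearly, $\varphi_i \neq \varphi_j$ for $i \neq j$. The function
$\psi$ is $G_\varphi$-invariant, and hence a function $\psi_i$ uniquely corresponds to
the coset $h_i G_\varphi$; if $G_\varphi \neq G_\psi$, then among these functions some will
coincide.

The function $\sum_{i=1}^{k} \frac{\psi_i}{t - \varphi_i}$ is invariant with respect to the
action of all permutations from $S_n$. Hence

\[ \sum_{i=1}^{k} \frac{\psi_i}{t - \varphi_i} = \frac{F(t)}{\Omega(t)}, \]

where

\[ \Omega(t) = (t - \varphi_1) \cdot \ldots \cdot (t - \varphi_k) \]

and $F(t)$ is a polynomial in $t$ whose coefficients are symmetric functions in
$x_1, \dots, x_n$. Since $\varphi_i \neq \varphi_j$ for $i \neq j$, it follows that
$\Omega'(\varphi) \neq 0$.
Clearly,

\[
\lim_{t \to \varphi_i} \frac{\Omega(t)}{t - \varphi_i}
= \lim_{t \to \varphi_i} \frac{\Omega(t) - \Omega(\varphi_i)}{t - \varphi_i} = \Omega'(\varphi_i).
\]

Therefore

\[
\lim_{t \to \varphi_i} \frac{\Omega(t)}{\Omega'(t)(t - \varphi_j)} =
\begin{cases} 0 & \text{for } \varphi_i \neq \varphi_j, \\ 1 & \text{for } \varphi_i = \varphi_j. \end{cases}
\]

Hence

\[ \psi_i = \frac{F(\varphi_i)}{\Omega'(\varphi_i)}, \quad \text{in particular} \quad \psi = \frac{F(\varphi)}{\Omega'(\varphi)}. \]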
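Second proof. Let us construct functions $\varphi_1 = \varphi, \dots, \varphi_k$ and
$\psi_1 = \psi, \dots, \psi_k$ as in the first proof. Clearly,

\[ \sum_{i=1}^{k} \varphi_i^s \psi_i = T_s \tag{1} \]

is a symmetric function in $x_1, \dots, x_n$. We consider the equalities (1) for
$s = 0, \dots, k - 1$ as a system of linear equations for $\psi_1, \dots, \psi_k$. Solving
this system we obtain $\psi_1 = \frac{D_1}{\Delta}$, where

\[
\Delta = \begin{vmatrix}
1 & \dots & 1 \\
\varphi_1 & \dots & \varphi_k \\
\vdots & & \vdots \\
\varphi_1^{k-1} & \dots & \varphi_k^{k-1}
\end{vmatrix}
\quad \text{and} \quad
D_1 = \begin{vmatrix}
T_0 & 1 & \dots & 1 \\
T_1 & \varphi_2 & \dots & \varphi_k \\
\vdots & \vdots & & \vdots \\
T_{k-1} & \varphi_2^{k-1} & \dots & \varphi_k^{k-1}
\end{vmatrix}.
\]

Let us express the identity obtained in the form $\psi = \frac{D_1 \Delta}{\Delta^2}$.
Clearly, $\Delta^2$ is a symmetric function. Under the transposition of any pair of
functions $\varphi_2, \dots, \varphi_k$, both determinants $D_1$ and $\Delta$ change sign,
and so $D_1 \Delta$ is a symmetric function in $\varphi_2, \dots, \varphi_k$. Hence

\[ D_1 \Delta = S_0 + \varphi_1 S_1 + \dots + \varphi_1^{k-1} S_{k-1}, \]

where $S_0, \dots, S_{k-1}$ are symmetric polynomials in $\varphi_2, \dots, \varphi_k$.
Therefore $S_0, \dots, S_{k-1}$ are expressed in terms of

\[ \tilde\sigma_1 = \varphi_2 + \dots + \varphi_k, \quad \tilde\sigma_2 = \varphi_2 \varphi_3 + \dots, \quad \dots \]

But, as is easy to verify,

\[ \tilde\sigma_1 = \sigma_1 - \varphi_1, \]
\[ \tilde\sigma_2 = \sigma_2 - \varphi_1 \sigma_1 + \varphi_1^2, \]
\[ \tilde\sigma_3 = \sigma_3 - \varphi_1 \sigma_2 + \varphi_1^2 \sigma_1 - \varphi_1^3, \]

where $\sigma_1, \sigma_2, \dots$ are elementary symmetric functions in
$\varphi_1, \dots, \varphi_k$. Hence $\tilde\sigma_1, \dots, \tilde\sigma_{k-1}$ are expressed
in terms of $\sigma_1, \dots, \sigma_{k-1}$ and $\varphi_1$. Hence $D_1 \Delta$ is a
polynomial in $\varphi_1 = \varphi$ whose coefficients are symmetric functions in
$x_1, \dots, x_n$.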

Lagrange's theorem has numerous corollaries. Let $\varphi$ and $\psi$ be rational
functions in $x_1, \dots, x_n$. If $\psi = R(\varphi)$, where $R$ is a rational function
whose coefficients are symmetric functions in $x_1, \dots, x_n$, we will briefly say
that $\psi$ is rationally expressed in terms of $\varphi$.


Corollary 1. Any rational function in $x_1, \dots, x_n$ is rationally expressed in
terms of $a_1 x_1 + \dots + a_n x_n$, where $a_1, \dots, a_n$ are distinct numbers.


Corollary 2. If $G_\varphi = G_\psi$, then the functions $\varphi$ and $\psi$ are rationally
expressed in terms of each other.


Corollary 3. If the rational function $r$ is invariant with respect to all the
permutations that preserve the functions $r_1, \dots, r_n$, then $r$ is rationally
expressed in terms of $r_1, \dots, r_n$.


Proof. We may assume that the functions $r_1, \dots, r_n$ are linearly
independent. Let $\varphi = a_1 r_1 + \dots + a_n r_n$, where $a_1, \dots, a_n$ are distinct
numbers. Then any permutation that preserves $\varphi$ should also preserve all the
functions $r_1, \dots, r_n$. Therefore $r$ is preserved under all the permutations that
preserve $\varphi$. Hence, $r$ is rationally expressed in terms of
$\varphi = a_1 r_1 + \dots + a_n r_n$.


Corollary 4. Let the polynomial $f(x_1, \dots, x_n)$ take only two distinct
values under all possible permutations of its variables. Then

\[ f = S_1 + \Delta S_2, \]

where $S_1$ and $S_2$ are symmetric functions and $\Delta = \prod_{i<j} (x_i - x_j)$.

Proof. If $f$ is not a symmetric function, then the substitutions that
preserve $f$ form a subgroup of $S_n$ of index 2. Therefore it suffices to prove that
$S_n$ has only one subgroup of index 2, namely, the alternating group $A_n$. Let
$G \subset S_n$ be a subgroup of index 2 and $h \in S_n \setminus G$. Then $S_n$ splits
into the non-intersecting subsets $G$ and $hG = Gh$. Therefore $hGh^{-1} = G$ and, if
$h_1, h_2 \in S_n \setminus G$, then $h_1 h_2 \in G$. If $G$ had contained a transposition
$(ij)$, then it would have contained any other transposition $(pq)$ as well. Indeed,
$(pq)$ is obtained from $(ij)$ by conjugation with any permutation that sends $i$ to $p$
and $j$ to $q$. Since the transpositions generate $S_n$ and $G \neq S_n$, all the
transpositions lie in $S_n \setminus G$, and hence their pairwise products lie in $G$.
Thus, all the products of any even number of transpositions lie in $G$, and therefore
$G \supset A_n$. But $|G| = |A_n|$, and hence $G = A_n$.


In Lagrange's theorem we deal with the rational functions in $x_1, \dots, x_n$.
The passage from algebraically independent variables $x_1, \dots, x_n$ to the
concrete values of these variables requires certain caution. The point is that the
permutations which preserve the value of $\varphi(x_1, \dots, x_n)$ for given
$x_1, \dots, x_n$ may not form a group. For example, let

\[ x_k = \exp\left(\frac{2\pi i k}{7}\right) \quad \text{for } k = 1, \dots, 6. \]

Consider the function $f(x_1, \dots, x_6) = x_1 x_6$. Let

\[ \sigma = (12)(56) \quad \text{and} \quad \tau = (16)(23). \]

The permutations $\sigma$ and $\tau$ send $f$ into $\sigma f = x_2 x_5 = 1$ and
$\tau f = x_6 x_1 = 1$ respectively, i.e., both $\sigma$ and $\tau$ preserve the value
of $f$. But $\tau$ sends $\sigma f = x_2 x_5$ to $\tau\sigma f = x_3 x_5 \neq 1$.
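
The arithmetic of this example is easy to confirm numerically. In the sketch below the roots are taken to be $x_k = \exp(2\pi i k/7)$, the seventh roots of unity other than 1 (consistent with the polynomial $x^6 + x^5 + \dots + x + 1$ mentioned later in the text); the helper names `act`, `value` and `is_one` are introduced only for this illustration.

```python
import cmath

# x_k = exp(2*pi*i*k/7) for k = 1, ..., 6
x = {k: cmath.exp(2j * cmath.pi * k / 7) for k in range(1, 7)}

sigma = {1: 2, 2: 1, 3: 3, 4: 4, 5: 6, 6: 5}   # the permutation (12)(56)
tau = {1: 6, 2: 3, 3: 2, 4: 4, 5: 5, 6: 1}     # the permutation (16)(23)

def act(perm, indices):
    """Apply a permutation to the indices of a monomial x_i * x_j."""
    return tuple(perm[i] for i in indices)

def value(indices):
    """Numerical value of the monomial x_i * x_j."""
    i, j = indices
    return x[i] * x[j]

def is_one(z, tol=1e-12):
    return abs(z - 1) < tol

f = (1, 6)                          # f = x_1 * x_6
sigma_f = act(sigma, f)             # x_2 * x_5
tau_f = act(tau, f)                 # x_6 * x_1
tau_sigma_f = act(tau, sigma_f)     # x_3 * x_5

results = [is_one(value(m)) for m in (f, sigma_f, tau_f, tau_sigma_f)]
print(results)   # the last entry is False: the value-preserving permutations do not compose
```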

To avoid such nuisances, Galois suggested considering not all the
permutations of the roots of the equation but only the ones that preserve all the
rational relations between them.

More precisely, let $f = x^n + a_{n-1} x^{n-1} + \dots + a_0$ be a polynomial with
coefficients from a field $K$ and let $\alpha_1, \dots, \alpha_n$ be the roots of $f$.
Galois suggested considering those permutations $\sigma$ such that, for any rational
function $r \in K(x_1, \dots, x_n)$, the identity $r(\alpha_1, \dots, \alpha_n) = 0$ implies
the identity $r(\alpha_{\sigma(1)}, \dots, \alpha_{\sigma(n)}) = 0$.

In modern terms this means that the permutation $\sigma$ corresponds to an
automorphism of the field $K(\alpha_1, \dots, \alpha_n)$ which preserves the ground field
$K$. The group of all such permutations is called the Galois group of the
polynomial $f$ (this group depends of course on the field $K$).

In the example considered above, the permutations $\sigma$ and $\tau$ do not enter
the Galois group of the polynomial $x^6 + x^5 + \dots + x + 1$ whose roots are
$x_1, \dots, x_6$. Indeed, the permutations $\sigma$ and $\tau$ do not preserve, for
example, the relations $x_2 = x_1^2$ and $x_6 = x_3^2$.

For any rational function in the roots $\alpha_1, \dots, \alpha_n$ of the polynomial $f$,
we can consider the elements of the Galois group which preserve its value. Clearly,
these elements form a group. Indeed, let $\sigma$ and $\tau$ be elements of the Galois
group of $f$ and let $r(\alpha_1, \dots, \alpha_n)$ be a rational function such that

\[ r(\alpha_{\sigma(1)}, \dots, \alpha_{\sigma(n)}) = r(\alpha_1, \dots, \alpha_n), \tag{2} \]

\[ r(\alpha_{\tau(1)}, \dots, \alpha_{\tau(n)}) = r(\alpha_1, \dots, \alpha_n). \tag{3} \]

We can apply $\sigma$ and $\tau$ to any rational relation between the roots
$\alpha_1, \dots, \alpha_n$. Hence, applying $\tau$ to the relation (2) we obtain

\[ r(\alpha_{\tau\sigma(1)}, \dots, \alpha_{\tau\sigma(n)}) = r(\alpha_{\tau(1)}, \dots, \alpha_{\tau(n)}). \]

Relation (3) implies that $\tau\sigma$ also preserves the value of $r$.


5.1.2 The Galois resolvent 
In this section, let

\[ f(x) = x^n + a_{n-1} x^{n-1} + \dots + a_0 \]

be a polynomial over a field $K$ of characteristic 0 and let $\alpha_1, \dots, \alpha_n$
be its roots. Suppose that $f$ has no multiple roots, i.e., the numbers
$\alpha_1, \dots, \alpha_n$ are distinct. Consider a rational function

\[ \psi(x_1, \dots, x_n) = m_1 x_1 + \dots + m_n x_n, \]

where $m_1, \dots, m_n$ are integers. Let us show that the numbers $m_1, \dots, m_n$ can
be selected so that all the $n!$ values
$\psi_\sigma = \psi(\alpha_{\sigma(1)}, \dots, \alpha_{\sigma(n)})$ are distinct.

Indeed, consider the function

\[ D(t_1, \dots, t_n) = \prod_{\sigma, \tau} \sum_{i=1}^{n} t_i (\alpha_{\sigma(i)} - \alpha_{\tau(i)}), \]

where the product runs over all unordered pairs of distinct permutations $\sigma$
and $\tau$. The function $D$ is a product of nonzero polynomials in $t_1, \dots, t_n$, and
hence $D$ is a nonzero polynomial in indeterminates $t_1, \dots, t_n$. But any
nonzero polynomial function takes a nonzero value for certain integer values
of its arguments $t_1 = m_1, \dots, t_n = m_n$. These integers $m_1, \dots, m_n$ are the
required ones.
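
In practice suitable multipliers are found by trying small integers. The sketch below is a floating-point illustration for the assumed example $f(x) = x^3 - 2$; the search bound, the tolerance and the function names are choices made here and not part of the text.

```python
import cmath
from itertools import permutations, product

# Roots of f(x) = x^3 - 2 (an assumed example with three distinct roots)
a = 2 ** (1 / 3)
omega = cmath.exp(2j * cmath.pi / 3)
roots = [a, a * omega, a * omega ** 2]

def resolvent_values(m):
    """All n! values m_1*alpha_sigma(1) + ... + m_n*alpha_sigma(n)."""
    return [sum(mi * roots[s] for mi, s in zip(m, sigma))
            for sigma in permutations(range(len(roots)))]

def all_distinct(values, tol=1e-9):
    return all(abs(u - v) > tol
               for i, u in enumerate(values) for v in values[i + 1:])

def find_multipliers(bound=4):
    """First tuple of integers in [0, bound) giving n! pairwise distinct values."""
    for m in product(range(bound), repeat=len(roots)):
        if all_distinct(resolvent_values(m)):
            return m
    return None

m = find_multipliers()
print(m, len(resolvent_values(m)))
```

Tuples with repeated entries are rejected automatically, since swapping the corresponding roots leaves the value unchanged.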

Before we advance further, we prove one auxiliary statement. 


Lemma. Any symmetric polynomial in the roots $\alpha_2, \dots, \alpha_n$ of $f$ is
polynomially expressed in terms of the root $\alpha_1$ and the coefficients
$a_0, \dots, a_{n-1}$.

Proof. Any symmetric polynomial in the roots $\alpha_2, \dots, \alpha_n$ is expressed in
terms of the coefficients of the polynomial

\[ (x - \alpha_2) \cdot \ldots \cdot (x - \alpha_n) = \frac{f(x)}{x - \alpha_1} = x^{n-1} + b_{n-2} x^{n-2} + \dots + b_0. \]

Here we have

\[ a_{n-1} = b_{n-2} - \alpha_1, \quad a_{n-2} = b_{n-3} - \alpha_1 b_{n-2}, \quad \dots, \]

i.e.,

\[ b_{n-2} = a_{n-1} + \alpha_1, \]
\[ b_{n-3} = a_{n-2} + \alpha_1 a_{n-1} + \alpha_1^2, \]
\[ b_{n-4} = a_{n-3} + \alpha_1 a_{n-2} + \alpha_1^2 a_{n-1} + \alpha_1^3, \quad \dots \]

Thus, the coefficients $b_0, \dots, b_{n-2}$ are polynomially expressed in terms of the
root $\alpha_1$ and the coefficients $a_0, \dots, a_{n-1}$.
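
The recurrences in this proof are exactly Horner's scheme for dividing $f$ by $x - \alpha_1$. The following sketch (the function name `deflate` and the sample polynomial are assumptions made for illustration) computes $b_{n-2}, \dots, b_0$ from $a_{n-1}, \dots, a_0$ and $\alpha_1$, together with the remainder, which vanishes when $\alpha_1$ is a root.

```python
def deflate(a, alpha1):
    """Divide the monic polynomial x^n + a[0] x^(n-1) + ... + a[n-1] by (x - alpha1).

    Returns ([b_{n-2}, ..., b_0], remainder), using b_{n-2} = a_{n-1} + alpha1,
    b_{n-3} = a_{n-2} + alpha1 * b_{n-2}, and so on.
    """
    b = []
    carry = 1                          # leading coefficient of f
    for coeff in a[:-1]:
        carry = coeff + alpha1 * carry
        b.append(carry)
    remainder = a[-1] + alpha1 * carry
    return b, remainder

# f(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3), with alpha_1 = 1
quotient, remainder = deflate([-6, 11, -6], 1)
print(quotient, remainder)   # coefficients of x^2 - 5x + 6 and remainder 0
```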
Select the numbers $m_1, \dots, m_n$ so that all the $n!$ values

\[ \psi_\sigma = m_1 \alpha_{\sigma(1)} + \dots + m_n \alpha_{\sigma(n)} \]

are distinct and consider the polynomial

\[ F(x) = \prod_{\sigma \in S_n} (x - m_1 \alpha_{\sigma(1)} - \dots - m_n \alpha_{\sigma(n)}). \]

The coefficients of this polynomial are symmetric polynomials with integer
coefficients in the roots of $f$, and hence they are rationally expressed in terms
of the coefficients of $f$. Therefore, if $f$ is a polynomial over $K$, then so is $F$.

Let us factorize $F$ into a product of monic factors irreducible over $K$. Any
such irreducible factor $G$ is called a Galois resolvent of $f$. Clearly, all Galois
resolvents are obtained from each other by permutations of the roots. Having
numbered the roots we may fix the Galois resolvent, the one corresponding to
the identity permutation. For definiteness sake, we will assume that $G$ has a
root

\[ \psi = m_1 \alpha_1 + \dots + m_n \alpha_n. \]


Theorem 5.1.2 (Galois). Any root of $f$ is rationally expressed (over $K$) in
terms of one of the roots of $G$.


Proof. Consider the polynomial

\[ F_1(x) = \prod_{\{\sigma \in S_n \mid \sigma(1) = 1\}} (x - m_1 \alpha_1 - m_2 \alpha_{\sigma(2)} - \dots - m_n \alpha_{\sigma(n)}). \]

The coefficients of $F_1$ are symmetric polynomials in $\alpha_2, \dots, \alpha_n$; hence, by
the Lemma, they are rationally expressed (over $K$) in terms of $\alpha_1$, i.e.,
$F_1(x) = g(x, \alpha_1)$, where $g$ is a polynomial in two variables over $K$. Clearly,
$g(\psi, \alpha_1) = F_1(\psi) = 0$.

Now consider the polynomial

\[ F_2(x) = \prod (x - m_1 \alpha_2 - m_2 \alpha_{\sigma(1)} - \dots - m_n \alpha_{\sigma(n)}), \]

where the product runs over the permutations $\sigma \in S_n$ such that $\sigma(2) = 2$.
From the proof of the Lemma we see that the coefficients of $F_2$ are the same
as for $F_1$ up to replacement of $\alpha_1$ by $\alpha_2$, i.e., $F_2(x) = g(x, \alpha_2)$.

By the hypothesis

\[ \psi = m_1 \alpha_1 + \dots + m_n \alpha_n \neq m_1 \alpha_2 + m_2 \alpha_{\sigma(1)} + \dots + m_n \alpha_{\sigma(n)}, \]

i.e., $F_2(\psi) \neq 0$. The same argument applies to $\alpha_3, \dots, \alpha_n$. Therefore
$\alpha_1$ is the only common root of the polynomials $f(x)$ and $g(\psi, x)$. This means
that the greatest common divisor of $f(x)$ and $g(\psi, x)$ is $x - \alpha_1$. But the
greatest common divisor of two polynomials is found by Euclid's algorithm, and so
$\alpha_1$ is rationally expressed in terms of $\psi$ and the coefficients of $f$ and $g$,
i.e., $\alpha_1$ is rationally expressed over $K$ in terms of $\psi$.


Corollary. All the roots of the Galois resolvent are rationally expressed
in terms of one of its roots.


Proof. Every root of the Galois resolvent is of the form

\[ m_1 \alpha_{\sigma(1)} + \dots + m_n \alpha_{\sigma(n)}. \]

Clearly, they are rationally expressed in terms of $\alpha_1, \dots, \alpha_n$. In turn,
$\alpha_1, \dots, \alpha_n$ are rationally expressed in terms of
$\psi = m_1 \alpha_1 + \dots + m_n \alpha_n$.


The Galois resolvents are a convenient tool for constructing the splitting
field $K(\alpha_1, \dots, \alpha_n)$. We recall the definition. Let $f$ be a polynomial over
a field $k$ without multiple roots (but not necessarily irreducible), and let
$\alpha_1, \dots, \alpha_n$ be all the roots of $f$. The field $K = k(\alpha_1, \dots, \alpha_n)$
is called the splitting field of $f$.

Indeed, $K(\alpha_1, \dots, \alpha_n) = K(\psi)$, i.e., instead of adjoining to $K$ all the
roots of a polynomial, we can adjoin just one root of the Galois resolvent of
the polynomial.

Another application of the Galois resolvent is that it can be used to
construct the Galois group. (It was precisely with the help of the resolvents that
Galois initially constructed the Galois groups of the polynomials.)

Let $\psi_1$ ($= \psi$), $\psi_2, \dots, \psi_r$ be the roots of the Galois resolvent $G$. As we
have shown, all of them can be rationally expressed in terms of $\psi$, i.e.,
$\psi_i = R_i(\psi)$, where $R_i \in K(x)$. The relation $\psi_i = R_i(\psi)$ can be considered
as a relation between the elements of the field $K(\psi)$. Hence we may assume that
$R_i$ is a polynomial of degree not higher than $\deg G - 1 = r - 1$. This polynomial is
uniquely determined. The formula $\psi_i = R_i(\psi)$ remains valid if we replace $R_i$
by $R_i + aG$, where $a$ is an arbitrary polynomial.

Consider polynomials $G(x)$ and $G_i(x) = G(R_i(x))$. The coefficients of
these polynomials lie in $K$ and the polynomials have a common root $\psi$. Since
by the hypothesis the polynomial $G$ is irreducible, it follows that any root
$\psi_j$ of $G$ is also a root of $G_i$, i.e., $R_i(\psi_j) = \psi_p$ for some $p$. This means
that $R_i(R_j(\psi)) = R_p(\psi)$, i.e., $R_i R_j \equiv R_p \pmod{G}$. Therefore, for any root
$\psi_s$ of $G$, we have $R_i(R_j(\psi_s)) = R_p(\psi_s)$. In particular, the set of
polynomials $R_1, \dots, R_r$ and the transformations of the roots $\psi_1, \dots, \psi_r$
which correspond to them is invariantly defined, i.e., does not depend on the choice
of the root $\psi_1$.

To the number

\[ \psi_i = m_1 \alpha_{\sigma_i(1)} + \dots + m_n \alpha_{\sigma_i(n)}, \]

a permutation $\sigma_i$ uniquely corresponds, and to the permutation, in its turn,
there corresponds an automorphism, which we also denote by $\sigma_i$, of the field
$K(\alpha_1, \dots, \alpha_n) = K(\psi)$. Thus, to the roots $\psi_1, \dots, \psi_r$ of the Galois
resolvent there correspond permutations $\sigma_1, \dots, \sigma_r$. To $\sigma_i$, we assign
the polynomial $R_i$. Since $\sigma_i(\psi) = \psi_i$, we see that

\[ \sigma_i \sigma_j(\psi) = \sigma_i(\psi_j) = \sigma_i R_j(\psi) = R_j(\psi_i) = R_j(R_i(\psi)), \]

i.e., $\sigma_i \sigma_j$ corresponds to $R_j R_i \bmod G$. Therefore the group of
transformations $R_1, \dots, R_r$ is anti-isomorphic¹ to the group of permutations
$\sigma_1, \dots, \sigma_r$. In order to establish a relation with the Galois group of $f$, it
remains to prove the following statement.


Theorem 5.1.3. The Galois group of the polynomial $f$ consists of the
permutations $\sigma_1, \dots, \sigma_r$.


Proof. We have to prove that, if $\alpha_1, \dots, \alpha_n$ are the roots of $f$ and
$\tau \in S_n$, then the condition $\tau \in \{\sigma_1, \dots, \sigma_r\}$ is equivalent to the
fact that

\[ \varphi(\alpha_1, \dots, \alpha_n) = 0 \implies \varphi(\alpha_{\tau(1)}, \dots, \alpha_{\tau(n)}) = 0 \]

for any rational function $\varphi$.
Since the roots $\alpha_1, \dots, \alpha_n$ can be rationally expressed in terms of
$\psi_1$, it follows that

\[ \varphi(\alpha_1, \dots, \alpha_n) = \Phi(\psi_1) = \Phi(m_1 \alpha_1 + \dots + m_n \alpha_n), \]

where $\Phi \in K(x)$. Therefore

\[ \varphi(\alpha_{\tau(1)}, \dots, \alpha_{\tau(n)}) = \Phi(\psi_\tau), \]

where $\psi_\tau = m_1 \alpha_{\tau(1)} + \dots + m_n \alpha_{\tau(n)}$. The equivalence of the
equalities $\Phi(\psi_1) = 0$ and $\Phi(\psi_\tau) = 0$ for all possible rational functions
means that $\psi_1$ and $\psi_\tau$ are roots of the same irreducible polynomial, i.e.,
$\tau \in \{\sigma_1, \dots, \sigma_r\}$.


Corollary. Let $f$ be a polynomial over a field $k$ with distinct roots
$\alpha_1, \dots, \alpha_n$. Then the order $|G|$ of the Galois group $G$ of $f$ is equal to
$[K : k]$, where $K = k(\alpha_1, \dots, \alpha_n)$.

Proof. Both $|G|$ and $[K : k]$ are equal to the degree of any Galois resolvent
of $f$.

¹ Recall that a bijective map of groups $a : G \to H$ is an anti-isomorphism if
$a(fg) = a(g)a(f)$ for any $f, g \in G$.


The following statement is of considerable importance for Galois theory. 


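Theorem 5.1.4. Let $\alpha_1, \dots, \alpha_n$ be the roots of an irreducible polynomial $f$
over $k$, and let $\varphi \in k(x_1, \dots, x_n)$.

a) Let $\varphi(\alpha_1, \dots, \alpha_n) = \varphi(\alpha_{\sigma_i(1)}, \dots, \alpha_{\sigma_i(n)})$ for
any permutation $\sigma_i$ in the Galois group. Then $\varphi(\alpha_1, \dots, \alpha_n) \in k$.

b) Let $H = \{\sigma_{i_1}, \dots, \sigma_{i_s}\}$ be a subgroup of the Galois group
$\{\sigma_1, \dots, \sigma_r\}$ such that if
$\varphi(\alpha_1, \dots, \alpha_n) = \varphi(\alpha_{\sigma(1)}, \dots, \alpha_{\sigma(n)})$ for any
permutation $\sigma \in H$, then $\varphi(\alpha_1, \dots, \alpha_n) \in k$. Then $H$ coincides
with the whole Galois group.


Proof. a) The roots $\alpha_1, \dots, \alpha_n$ can be rationally expressed in terms of
$\psi_1$, and hence $\varphi(\alpha_1, \dots, \alpha_n) = \Phi(\psi_1)$, where $\Phi \in k(x)$.
Therefore

\[ \varphi(\alpha_{\sigma_i(1)}, \dots, \alpha_{\sigma_i(n)}) = \Phi(\psi_i). \]

Thus, $\Phi(\psi_1) = \dots = \Phi(\psi_r)$, and hence

\[ \Phi(\psi_1) = \frac{1}{r} \bigl(\Phi(\psi_1) + \dots + \Phi(\psi_r)\bigr) \]

is a rational symmetric function in the roots $\psi_1, \dots, \psi_r$ of the Galois
resolvent. Hence $\varphi(\alpha_1, \dots, \alpha_n) = \Phi(\psi_1) \in k$.

b) Consider the polynomial

\[ g(x) = \prod_{\sigma \in H} (x - \sigma(\psi_1)) = (x - \psi_{i_1}) \cdot \ldots \cdot (x - \psi_{i_s}). \]

Its coefficients are invariant with respect to the $H$-action, and hence all of
them belong to $k$. Therefore the coefficients of $g(x)$ lie in $k$ and $g(x)$ has
common roots with the polynomial

\[ G(x) = (x - \psi_1) \cdot \ldots \cdot (x - \psi_r), \]

which is irreducible over $k$. Hence $g(x) = G(x)$ and $H = \{\sigma_1, \dots, \sigma_r\}$.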


5.1.3 Theorem on a primitive element 


On page 187 we observed that the field $k(\alpha_1, \dots, \alpha_n)$, where
$\alpha_1, \dots, \alpha_n$ are the roots of a polynomial $f$, can be generated over $k$ by
one element, namely, by a root $\psi$ of a Galois resolvent of $f$. Usually, to construct
the element that generates the field, one uses the following standard construction.


Theorem 5.1.5 (On a primitive element). Let $k$ be a field of characteristic
zero, and let $\alpha_1, \dots, \alpha_n$ be algebraic elements over $k$. Then
$k(\alpha_1, \dots, \alpha_n)$ is generated over $k$ by one element.

Proof. First, consider the case of two algebraic elements, $\alpha$ and $\beta$. Let
$f(x)$ and $g(x)$ be polynomials irreducible over $k$ with roots $\alpha$ and $\beta$,
respectively. Let further $\alpha_1 = \alpha, \alpha_2, \dots, \alpha_r$ be all the roots of $f$
and $\beta_1 = \beta, \beta_2, \dots, \beta_s$ all the roots of $g$. Select $c \in k$ so that
$\alpha_i + c\beta_j \neq \alpha_1 + c\beta_1$ for $j \neq 1$. Set

\[ \theta = \alpha_1 + c\beta_1 = \alpha + c\beta. \]

Clearly, $k(\theta) \subset k(\alpha, \beta)$. It remains to show that
$k(\alpha, \beta) \subset k(\theta)$. To do this, it suffices to prove that $\beta \in k(\theta)$.
Indeed, in this case $\alpha = \theta - c\beta \in k(\theta)$.
The element $\beta$ satisfies the equations

\[ g(x) = 0 \quad \text{and} \quad f(\theta - cx) = 0 \]

whose coefficients belong to $k(\theta)$. The only common root of the polynomials
$g(x)$ and $f(\theta - cx)$ is $\beta$ since $\theta - c\beta_j \neq \alpha_i$ for $j \neq 1$. The
polynomial $g(x)$ has no multiple roots, and so the greatest common divisor of $g(x)$
and $f(\theta - cx)$ is $x - \beta$. The greatest common divisor of two polynomials over the
field $k(\theta)$ is a polynomial over $k(\theta)$, and hence $\beta \in k(\theta)$.

The passage from $n = 2$ to an arbitrary $n$ is performed by an obvious
induction: if $k(\alpha_1, \dots, \alpha_{n-1}) = k(\theta)$, then
$k(\alpha_1, \dots, \alpha_n) = k(\theta, \alpha_n) = k(\theta')$ for some $\theta'$.
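
The construction can be traced on the classical example $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$ over $\mathbb{Q}$. The sketch below is a floating-point illustration under assumptions of this example: it searches small positive integers for an admissible $c$ and then checks the explicit expressions of $\alpha$ and $\beta$ through $\theta = \sqrt{2} + \sqrt{3}$, which follow from $\theta^3 = 11\sqrt{2} + 9\sqrt{3}$.

```python
from math import sqrt, isclose

alphas = [sqrt(2), -sqrt(2)]   # roots of f(x) = x^2 - 2; alpha = alphas[0]
betas = [sqrt(3), -sqrt(3)]    # roots of g(x) = x^2 - 3; beta = betas[0]

def admissible(c, tol=1e-9):
    """True if alpha_i + c*beta_j differs from alpha_1 + c*beta_1 whenever j != 1."""
    theta = alphas[0] + c * betas[0]
    return all(abs(a + c * b - theta) > tol for a in alphas for b in betas[1:])

c = next(c for c in range(1, 10) if admissible(c))
theta = alphas[0] + c * betas[0]

# For c = 1: theta^3 = 11*sqrt(2) + 9*sqrt(3), so both generators are polynomials in theta.
alpha_from_theta = (theta ** 3 - 9 * theta) / 2
beta_from_theta = (11 * theta - theta ** 3) / 2
print(c, alpha_from_theta, beta_from_theta)
```

The value $c = 0$ is skipped deliberately: it would give $\theta = \alpha$, and the condition of the proof fails because $\alpha_1 + 0 \cdot \beta_2 = \alpha_1 + 0 \cdot \beta_1$.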


With the help of Theorem 1.5 one can prove, for example, the following 
statement on prime divisors of a set of polynomials. 


Theorem 5.1.6 ({[Ho]). Let fi(x),..., f(x) be non-constant integer-valued 
polynomials, t.e., fi(m) € Z form € Z. Further, let M; be the set of all prime 
divisors of all the numbers f;(m) € Z, where m € Z and f;(m) #0. Then the 
set M = M,U---UM,, is infinite. 


Proof. First, consider the case n = 1 (in this case the theorem was proved 
by Schur [Sc6]). If fi() is an integer-valued polynomial of degree k, then all 
the coefficients of the polynomial k!f;(a) are integers by Theorem 3.2.1 on 
page 85. Passing from f, to k!f; we add to prime divisors of f; only prime 
divisors of the number k!, i-e., a certain finite set of divisors. Therefore it 
suffices to carry out the proof for the case of the polynomial f; with integer 
coefficients. 

Now fi assumes values 0 and +1 at finitely many points only. Therefore 
the values of f; at integer points have at least one prime divisor; in other 
words, M, 4 @. Suppose that M, = {pi,...,p,} is a finite set. 

Let $a \in \mathbb{Z}$ and $f_1(a) = b \neq 0$. Let us show that

$$g(x) = \frac{f_1(a + b\,p_1 \cdots p_r\,x)}{b}$$

is a polynomial with integer coefficients such that
$g(x) \equiv 1 \pmod{p_1 \cdots p_r}$ for all $x \in \mathbb{Z}$. It is easy to
verify that if $c \in \mathbb{Z}$, then $(a + cx)^i - a^i = c\,h_i(x)$, where
$h_i(x)$ is a polynomial with integer coefficients. Hence

$$f_1(a + b\,p_1 \cdots p_r\,x) - f_1(a) = b\,p_1 \cdots p_r\,h(x),$$

where $h(x)$ is a polynomial with integer coefficients. It remains to observe
that

$$g(x) = \frac{f_1(a) + b\,p_1 \cdots p_r\,h(x)}{b} = 1 + p_1 \cdots p_r\,h(x).$$

For a certain $x \in \mathbb{Z}$, the integer $g(x)$ has a prime divisor $p$
(the polynomial $g$ is non-constant, so it assumes the values $0$ and $\pm 1$
at finitely many points only). The congruence
$g(x) \equiv 1 \pmod{p_1 \cdots p_r}$ shows that
$p \notin M_1 = \{p_1,\dots,p_r\}$. On the other hand, the number
$f_1(a + b\,p_1 \cdots p_r\,x) = b\,g(x)$ is nonzero and divisible by $p$, and
hence $p \in M_1$. The contradiction obtained means that $M_1$ is an infinite
set.
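
The argument for n = 1 is effective: from any finite list of primes it produces a prime outside the list that divides a nonzero value of the polynomial. The following is a minimal sketch of that procedure, assuming a non-constant polynomial with integer coefficients; the function names and the sample polynomial $x^2 + 1$ are illustrative choices and do not come from the text.

```python
from math import prod


def poly_eval(coeffs, x):
    """Evaluate an integer polynomial (coefficients from highest degree down) at x."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def smallest_prime_factor(m):
    """Smallest prime factor of an integer m with |m| >= 2, by trial division."""
    m = abs(m)
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m


def new_prime_divisor(coeffs, a, primes):
    """Return (p, m): a prime p outside `primes` that divides f(m) != 0.

    Mirrors the proof: b = f(a), P = product of the given primes,
    g(x) = f(a + b*P*x) / b, and g(x) = 1 (mod P).
    """
    b = poly_eval(coeffs, a)
    if b == 0:
        raise ValueError("choose a with f(a) != 0")
    P = prod(primes)
    x = 1
    while True:
        m = a + b * P * x
        value = poly_eval(coeffs, m)
        assert value % b == 0        # g takes integer values
        g = value // b
        assert (g - 1) % P == 0      # g(x) = 1 (mod p_1 ... p_r)
        if abs(g) > 1:               # skip the finitely many x with g(x) in {0, 1, -1}
            return smallest_prime_factor(g), m
        x += 1


# f(x) = x^2 + 1, a = 1 (so b = 2), starting from the primes 2 and 5.
p, m = new_prime_divisor([1, 0, 1], 1, [2, 5])
print(p, m)  # 13 21, since f(21) = 442 = 2 * 13 * 17
```

Feeding the enlarged list back into `new_prime_divisor` produces further primes, which is the computational content of the statement that $M_1$ is infinite.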

The passage from n = 1 to an arbitrary n is performed with the help of
Theorem 5.1.5. As in the proof for n = 1, we assume that
$f_1,\dots,f_n \in \mathbb{Z}[x]$. Let $\alpha_i$ be one of the roots of $f_i$.
By Theorem 5.1.5, $\mathbb{Q}(\alpha_1,\dots,\alpha_n) = \mathbb{Q}(\alpha)$,
i.e., $\alpha_i = \varphi_i(\alpha)$, where $\varphi_i(t) \in \mathbb{Q}[t]$.
Let $g$ be an irreducible polynomial over $\mathbb{Q}$ with root $\alpha$. If
$\varphi_i(0) \neq 0$, then replace $\varphi_i$ by $\tilde\varphi_i$, where

$$\tilde\varphi_i(t) = \varphi_i(t) - \frac{\varphi_i(0)}{g(0)}\,g(t).$$

Since $g(\alpha) = 0$, this does not change the value at $\alpha$. Thus, we may
assume that $\varphi_i(0) = 0$. In this case, if a number $N$ is divisible by
the denominators of all the coefficients of $\varphi_i$, then
$\varphi_i(Nt) \in \mathbb{Z}[t]$.

The polynomials $f_1(\varphi_1(t)),\dots,f_n(\varphi_n(t)) \in \mathbb{Q}[t]$
have a common root $\alpha$, and hence all of them are divisible by $g(t)$.
Consider the polynomials

$$F_i(t) = f_i(\varphi_i(Nt)).$$

Clearly, $F_i(t) \in \mathbb{Z}[t]$ and $F_i(t)$ is divisible over $\mathbb{Q}$
by $g(Nt)$, i.e., $F_i(t) = g(Nt)\,g_i(t)$, where $g_i(t) \in \mathbb{Q}[t]$.
Let us express the polynomials $g$ and $g_i$ in the form

$$g(Nt) = r\,h(t) \quad \text{and} \quad g_i(t) = s_i\,h_i(t),$$

where $r, s_i \in \mathbb{Q}$, $h(t), h_i(t) \in \mathbb{Z}[t]$ and
$\operatorname{cont}(h) = \operatorname{cont}(h_i) = 1$.

Then $F_i(t) = r s_i\,h(t)\,h_i(t)$ and, by Gauss's lemma,
$r s_i = \operatorname{cont}(F_i)$ is an integer. This means that $F_i(t)$ is
divisible over $\mathbb{Z}$ by $h(t)$. In particular, all the divisors of the
values of $h$ at integer points are divisors of the values of $F_i$ which, in
turn, are divisors of the values of $f_i$. Therefore the set $M$ contains an
infinite subset consisting of prime divisors of the polynomial $h$.


5.2 Basic Galois theory 


5.2.1 The Galois correspondence 


Theorem 5.2.1. Any element $w \in K = k(\alpha_1,\dots,\alpha_n)$ is a root of
an irreducible over $k$ polynomial $h$ all of whose roots belong to $K$.


Proof. One can represent $w$ in the form $w = g(\alpha_1,\dots,\alpha_n)$,
where $g \in k[x_1,\dots,x_n]$. Set
$w_\sigma = g(\alpha_{\sigma(1)},\dots,\alpha_{\sigma(n)})$ and consider the
polynomial

$$h(x) = \prod_{\sigma \in S_n} (x - w_\sigma).$$

Clearly, all the roots of $h$ belong to $K$ and $h$ is a polynomial over $k$.
Hence $w$ is a root of an irreducible divisor of $h$.


Corollary. If $p(x)$ is irreducible over $k$ and one of its roots belongs to
$K = k(\alpha_1,\dots,\alpha_n)$, then all the other roots of $p(x)$ belong
to $K$.


Proof. Let w € K be a root of p and let h be an irreducible over k 
polynomial all of whose roots belong to K and h(w) = 0. The polynomials h 
and p are irreducible over k and have a common root w. Hence all the roots 
of p are the roots of h, i.e., belong to K. 


A finite extension $K$ of the field $k$ is called a normal extension or a
Galois extension if any irreducible over $k$ polynomial, one of whose roots
belongs to $K$, factorizes over $K$ into linear factors, i.e., all its roots
lie in $K$.

An example of a non-normal extension is the field $\mathbb{Q}(\sqrt[3]{2})$.
This field contains only one of the roots of the polynomial $x^3 - 2$.

By Theorem 5.2.1, the splitting field of any polynomial is a normal extension.
The converse statement is also true.


Theorem 5.2.2. Let $K \supset k$ be a normal extension. Then $K$ is a splitting
field of a polynomial over $k$.


Proof. Let $K = k(\alpha_1,\dots,\alpha_n)$ and let $f_i$ be an irreducible
polynomial over $k$ with root $\alpha_i$. Let $f = f_1 \cdots f_n$, and let $K'$
be the splitting field of $f$ over $k$. On the one hand, the elements
$\alpha_1,\dots,\alpha_n$ are all roots of $f$, and so $K \subset K'$. On the
other hand, $K$ is a normal field over $k$ and contains a root of the
irreducible over $k$ polynomial $f_i$ ($i = 1,\dots,n$), and so $K$ contains
all the roots of $f$, and therefore $K \supset K'$.


Corollary. Let $K \supset L \supset k$, where $K$ is a normal extension of $k$,
and $L$ an arbitrary intermediate field. Then $K$ is a normal extension of $L$.


Proof. The field K is the splitting field of a polynomial f over k. This 
polynomial can be considered also as a polynomial over L, and so K is the 
splitting field of a polynomial over L, i.e., K is a normal extension of L. 


If $K$ is a normal extension of $k$, then the Galois group of $K$ over $k$ is
the group of automorphisms of $K$ preserving all the elements of $k$. The
Galois group of $K$ over $k$ will be denoted by the symbol $G(K,k)$. If $K$ is
the splitting field of a polynomial $f$, we will also denote the Galois group
$G(K,k)$ by the symbol $G_k(f)$.

The elements $w$ and $w'$ of the field $K \supset k$ are called conjugate over
$k$ if $w$ and $w'$ are the roots of the same polynomial irreducible over $k$.


Theorem 5.2.3. Let $K$ be a normal extension of $k$. The elements
$w, w' \in K$ are conjugate over $k$ if and only if there exists an
automorphism $\sigma \in G(K,k)$ which sends $w$ into $w'$.


Proof. Let $w$ be a root of an irreducible over $k$ polynomial $p$. If
$w' = \sigma(w)$, then $p(w') = p(\sigma(w)) = \sigma(p(w)) = 0$, and therefore
$w$ and $w'$ are conjugate over $k$.

Now suppose that $w$ and $w'$ are the roots of an irreducible over $k$
polynomial $p$. To construct the automorphism $\sigma$, we start by
constructing an isomorphism $\varphi: k(w) \to k(w')$. Any element of the field
$k(w)$ can be uniquely represented in the form

$$a_0 + a_1 w + \dots + a_{n-1} w^{n-1}, \quad \text{where } a_i \in k \text{ and } n = \deg p.$$

The isomorphism to be constructed is of the form
$\sum_{i=0}^{n-1} a_i w^i \mapsto \sum_{i=0}^{n-1} a_i (w')^i$.

Now, select $\theta \in K \setminus k(w)$. Let $p_1$ be an irreducible over $k$
polynomial with root $\theta$ and $q_1$ an irreducible over $k(w)$ divisor of
$p_1$ such that $q_1(\theta) = 0$. Under the isomorphism
$\varphi: k(w) \to k(w')$ the irreducible over $k(w)$ polynomial $q_1$ becomes
an irreducible over $k(w')$ polynomial $\bar q_1$. Let $\theta'$ be a root of
$\bar q_1$. The isomorphism $\varphi: k(w) \to k(w')$ can be extended to an
isomorphism $k(w,\theta) \to k(w',\theta')$. This isomorphism is of the form
$\sum b_i \theta^i \mapsto \sum \varphi(b_i)(\theta')^i$, where
$b_i \in k(w)$. Such extensions of isomorphisms enable one to construct an
isomorphism of $K$ with a subfield $K' \subset K$ that sends $w$ to $w'$. This
isomorphism of fields is, in particular, an isomorphism of linear spaces over
$k$. Therefore, since the dimension of $K$ is finite, it follows that $K' = K$,
i.e., we have obtained an automorphism of $K$.


Corollary. If $K$ is a normal extension of $k$, then the element $w \in K$ is
invariant with respect to the action of the Galois group $G(K,k)$ if and only
if $w \in k$.


Proof. If $w \in K$ is invariant with respect to $G(K,k)$, then all its
conjugates coincide with it. This means that $w$ is a root of the polynomial
$x - w$ with coefficients in $k$, i.e., $w \in k$.


The property that the extension be normal is very essential. For example,
any automorphism of the field $\mathbb{Q}(\sqrt[3]{2})$ is the identity, i.e.,
the element $\sqrt[3]{2} \notin \mathbb{Q}$ is invariant under all the
automorphisms.

A particular feature of normal extensions is that their group of automor- 
phisms is rather large. This makes it possible to establish a one-to-one corre- 
spondence between the intermediate fields and subgroups of the Galois group. 

Let $K$ be a normal extension of $k$. Consider an arbitrary intermediate field
$L$, i.e., $k \subset L \subset K$. By Corollary of Theorem 5.2.2 the field $K$
is a normal extension of $L$, and therefore we can consider the Galois group
$G(K,L)$.


Theorem 5.2.4 (The Galois correspondence). a) There is a one-to-one
correspondence between the intermediate fields $k \subset L \subset K$ and the
subgroups of the Galois group $G(K,k)$. To the field $L$, there corresponds the
subgroup $G(K,L) \subset G(K,k)$ and to the subgroup $H \subset G(K,k)$ there
corresponds the field consisting of the $H$-invariant elements of $K$.

b) An intermediate field $L$ is a normal extension of $k$ if and only if the
subgroup $G(K,L) \subset G(K,k)$ is normal. In this case, we have the exact
sequence

$$0 \to G(K,L) \to G(K,k) \to G(L,k) \to 0.$$


Proof. a) Any automorphism of the field K that preserves the elements 
of L also preserves the elements of k C L, i.e., G(K,L) C G(K,k). 

To the field $L$, there corresponds the group $G(K,L)$ and to the group
$G(K,L)$ there corresponds a field $L'$ consisting of the elements of $K$
invariant with respect to all the automorphisms of $K$ that preserve the
elements of $L$. Clearly, $L' \supset L$. But for the case of normal extensions
any element of $K$ invariant with respect to $G(K,L)$ belongs to $L$
(Corollary of Theorem 5.2.3). (An independent proof follows from
Theorem 5.1.4 (a) on page 189.)

Therefore $L = L'$, i.e., to every subfield there corresponds a subgroup and
this subgroup uniquely determines the subfield.

To the subgroup $H \subset G(K,k)$ there corresponds a field $L$, and to the
field $L$ there corresponds the subgroup $G(K,L) = H'$ consisting of the
automorphisms of $K$ that preserve the elements invariant with respect to $H$.
Clearly, $H' \supset H$. But if the subgroup $H$ of $G(K,L)$ is such that all
the elements of $K$ invariant with respect to the $H$-action lie in $L$, then
$H$ coincides with the whole Galois group $G(K,L)$ (Theorem 5.1.4 (b) on
page 189). Therefore to each subgroup there corresponds a subfield and this
subfield uniquely determines the subgroup.

b) Let $L$ be a normal extension of $k$. Then any automorphism of $K$ over $k$
sends $L$ into itself (an element of $L$ goes into a conjugate element, which
again belongs to $L$). Therefore there is a homomorphism $G(K,k) \to G(L,k)$.
Clearly, the group $G(K,L)$ is the kernel of this homomorphism, and hence it
is a normal subgroup of $G(K,k)$.

Now, let $G(K,L)$ be a normal subgroup of $G(K,k)$, i.e., if
$\varphi \in G(K,L)$ and $\psi \in G(K,k)$, then
$\psi^{-1}\varphi\psi \in G(K,L)$. Let $a \in L$. We have to prove that all the
elements $a_1,\dots,a_l$ conjugate to $a$ belong to $L$. By the hypothesis
$a_1,\dots,a_l \in K$ and $a_i = \psi_i(a)$ for some $\psi_i \in G(K,k)$. If
$\varphi \in G(K,L)$, then $\psi_i^{-1}\varphi\psi_i \in G(K,L)$. Therefore
$\psi_i^{-1}\varphi\psi_i(a) = a$, i.e., $\varphi(a_i) = a_i$. Hence
$a_i \in L$.

The description of the exact sequence

$$0 \to G(K,L) \to G(K,k) \to G(L,k) \to 0$$

is as follows. The field $L$ is a normal extension of $k$, and so any
automorphism $\varphi \in G(K,k)$ preserves $L$, and therefore one can consider
the restriction of $\varphi$ onto $L$. This gives rise to the homomorphism
$G(K,k) \to G(L,k)$. Its kernel consists of the automorphisms of $K$ identical
on $L$. They constitute the group $G(K,L)$. The epimorphic nature of the
homomorphism $G(K,k) \to G(L,k)$ follows from the fact that any automorphism
of $L$ over $k$ can be extended to an automorphism of $K$ over $k$.

(Recall that a sequence of maps
$\cdots \to A \xrightarrow{\alpha} B \xrightarrow{\beta} C \to \cdots$ is exact
in $B$ if $\operatorname{Im}(\alpha) = \operatorname{Ker}(\beta)$. The sequence
is exact if it is exact in all its terms except the first and the last.)


If $L$ is the splitting field of $g$ over $k$ and $K$ is the splitting field of
$f$ over $k$, then the exact sequence

$$0 \to G(K,L) \to G(K,k) \to G(L,k) \to 0$$

takes the form

$$0 \to G_L(f) \to G_k(f) \to G_k(g) \to 0.$$


Theorem 5.2.5. Let $L$ be an arbitrary extension of the field $k$ and let $f$
be a polynomial over $k$. Then the Galois group $G_L(f)$ is isomorphic to a
subgroup of $G_k(f)$.


Proof. Let $\alpha_1,\dots,\alpha_n$ be all the roots of $f$. The automorphism
$\sigma \in G_L(f)$ permutes the roots $\alpha_1,\dots,\alpha_n$ and preserves
all the elements of $L \supset k$. Therefore $\sigma$ preserves the field
$k(\alpha_1,\dots,\alpha_n)$, i.e., to $\sigma$ we may assign the automorphism
$\bar\sigma \in G_k(f)$ which is the restriction of $\sigma$ onto
$k(\alpha_1,\dots,\alpha_n)$.

If $\bar\sigma = \mathrm{id}$, then $\sigma$ preserves all the roots
$\alpha_1,\dots,\alpha_n$. Moreover, by definition, $\sigma$ preserves all the
elements of $L$, and hence $\sigma = \mathrm{id}$. Therefore the map
$\sigma \mapsto \bar\sigma$ is a monomorphism, i.e., $G_L(f)$ is isomorphic to
a subgroup of $G_k(f)$.


5.2.2 A polynomial with the Galois group $S_5$


In order to give an example of an equation not solvable by radicals, we need a
polynomial whose Galois group is $S_n$, $n \geq 5$. We are ready to prove that
the Galois group of the polynomial $x^5 - 4x + 2$ over $\mathbb{Q}$ is equal
to $S_5$.

The subgroup $G \subset S_n$ is transitive if for any two indices
$i, j \in \{1,\dots,n\}$ there exists a permutation $\sigma \in G$ such that
$\sigma(i) = j$.


Theorem 5.2.6. The polynomial f without multiple roots is irreducible if and 
only if its Galois group is transitive. 


Proof. Let $\alpha_1,\dots,\alpha_n$ be the roots of $f$. If $f$ is irreducible
over $k$, then by definition all its roots are conjugate to each other. Hence,
by Theorem 5.2.3, there exists an automorphism of the field
$k(\alpha_1,\dots,\alpha_n)$ that sends $\alpha_i$ to $\alpha_j$.

Now, suppose that the Galois group $G$ of $f$ is transitive. Let $f_1$ be an
arbitrary non-constant divisor of $f$ over $k$ and $\alpha_1$ a root of $f_1$.
Take an automorphism $\sigma_i \in G$ such that
$\sigma_i(\alpha_1) = \alpha_i$. Under $\sigma_i$ the relation
$f_1(\alpha_1) = 0$ becomes $f_1(\alpha_i) = 0$, i.e., all the roots of $f$
serve also as roots of $f_1$. Since $f$ has no
multiple roots, we deduce that $f_1$ is divisible by $f$, i.e., $f$ is
irreducible.


Theorem 5.2.7. Let $n$ be a prime and let the transitive subgroup
$G \subset S_n$ contain at least one transposition $(i,j)$. Then $G = S_n$.


Proof. On $\{1,\dots,n\}$, introduce a relation by setting $i \sim j$ if either
$i = j$ or $G$ has the transposition $(i,j)$. The identity
$(i,j)(j,k)(i,j) = (i,k)$ proves that this relation is an equivalence relation.

Let $E(i)$ be the equivalence class containing $i$. Let us show that
$|E(i)| = |E(j)|$, i.e., all the equivalence classes consist of the same number
of elements. Since $G$ is transitive, it has an element $\sigma$ such that
$\sigma(i) = j$. Let $a \in E(i)$, $a \neq i$, i.e., $(i,a) \in G$. The element
$\sigma \cdot (i,a) \cdot \sigma^{-1}$ interchanges $\sigma(a)$ and $\sigma(i)$
leaving the remaining elements fixed, i.e.,

$$\sigma \cdot (i,a) \cdot \sigma^{-1} = (\sigma(i), \sigma(a)) = (j, \sigma(a)) \in G.$$

Therefore $\sigma(E(i)) \subset E(j)$, and hence $|E(i)| \leq |E(j)|$. The
inequality $|E(j)| \leq |E(i)|$ is similarly proved.

Thus the common number of elements of the classes divides $n$, and it is
greater than 1 because $G$ contains a transposition. By the hypothesis $n$ is a
prime, and so there is precisely one equivalence class. Hence $G$ contains all
the transpositions, and this means that $G = S_n$.
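
Theorem 5.2.7 can be checked by brute force for small n. The sketch below (the helper names are illustrative, and permutations of {0, ..., n-1} are stored as tuples) generates the subgroup spanned by an n-cycle and a transposition. For the prime n = 5 it returns all 120 permutations, whereas for n = 4 a 4-cycle together with the transposition (0 2) generates only a transitive group of order 8, so the primality assumption cannot be dropped.

```python
def compose(s, t):
    """Composition of permutations stored as tuples: (s o t)(i) = s[t[i]]."""
    return tuple(s[i] for i in t)


def generated_group(generators):
    """Closure of the generators under multiplication (enough for finite groups)."""
    n = len(generators[0])
    identity = tuple(range(n))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


def is_transitive(group, n):
    """A permutation group is transitive iff the orbit of the point 0 is everything."""
    return {g[0] for g in group} == set(range(n))


# n = 5 (prime): the 5-cycle 0->1->2->3->4->0 and the transposition (0 1).
big = generated_group([(1, 2, 3, 4, 0), (1, 0, 2, 3, 4)])
# n = 4 (not prime): the 4-cycle 0->1->2->3->0 and the transposition (0 2).
small = generated_group([(1, 2, 3, 0), (2, 1, 0, 3)])
print(len(big), len(small))  # 120 8
```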


Theorem 5.2.8. Let $f$ be an irreducible polynomial over $\mathbb{Q}$ of prime
degree $p$ with precisely two non-real roots. Then the Galois group of $f$ over
$\mathbb{Q}$ is $S_p$.


Proof. Since $f$ is irreducible, its Galois group $G \subset S_p$ is
transitive. The above theorem shows that to prove the statement desired it
suffices to prove that $G$ contains a transposition. An example of such a
transposition is given by the restriction of the complex conjugation
$z \mapsto \bar z$ onto the splitting field of $f$. Indeed, under the complex
conjugation, all the real roots of the polynomial remain fixed whereas the two
non-real roots are interchanged (this implies, in particular, that the
splitting field transforms into itself).


The polynomial $f(x) = x^5 - 4x + 2$ is an example of an irreducible polynomial
of degree 5 with precisely two non-real roots. The irreducibility of this
polynomial follows from Eisenstein's criterion. The number of real roots of $f$
is not less than 3 since

$$f(-2) < 0, \quad f(0) > 0, \quad f(1) < 0, \quad f(2) > 0.$$

On the other hand, it cannot have more than three real roots because otherwise
the derivative $f'(x) = 5x^4 - 4$ would have had more than two real roots.
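
The two facts used for $x^5 - 4x + 2$ are finite checks. A small sketch (the function names are illustrative) evaluates the polynomial at the four sample points to count sign changes and tests the hypotheses of Eisenstein's criterion for the prime 2.

```python
def f(x):
    return x**5 - 4 * x + 2


def sign_changes(values):
    """Number of sign changes in a sequence of nonzero numbers."""
    signs = [v > 0 for v in values]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def eisenstein_applies(coeffs, p):
    """Eisenstein's hypotheses for integer coefficients listed from the leading one down."""
    lead, rest, const = coeffs[0], coeffs[1:], coeffs[-1]
    return lead % p != 0 and all(c % p == 0 for c in rest) and const % (p * p) != 0


values = [f(x) for x in (-2, 0, 1, 2)]
print(values)                                      # [-22, 2, -1, 26]
print(sign_changes(values))                        # 3, hence at least three real roots
print(eisenstein_applies([1, 0, 0, 0, -4, 2], 2))  # True
```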


5.2.3 Simple radical extensions 


The splitting field $K$ of the polynomial $x^n - c$, where $c \in k$, is called
a simple radical extension of $k$.




Theorem 5.2.9. a) The Galois group $G(K,k)$ of a simple radical extension is
solvable.

b) If $k$ contains an $n$th primitive root of unity, then $G(K,k)$ is a
subgroup of the cyclic group $\mathbb{Z}/n\mathbb{Z}$.

c) If $c = 1$, then $G(K,k)$ is a subgroup of the multiplicative group
$(\mathbb{Z}/n\mathbb{Z})^*$.


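Proof. a) Let $\alpha$ be a root of the polynomial $x^n - c$ and $\varepsilon$
an $n$th primitive root of unity. Then all the roots of $x^n - c$ are of the
form $\alpha, \alpha\varepsilon, \dots, \alpha\varepsilon^{n-1}$, and therefore
$K \subset k(\alpha,\varepsilon)$. On the other hand,
$\varepsilon = (\alpha\varepsilon)\alpha^{-1} \in K$, and so
$k(\alpha,\varepsilon) \subset K$, i.e., $K = k(\alpha,\varepsilon)$.

Let $\sigma \in G(K,k)$. Then $\sigma(\varepsilon)$ is a root of $x^n - 1$,
i.e., $\sigma(\varepsilon) = \varepsilon^a$. Observe that $\varepsilon^a$
cannot be a root of $x^m - 1$, where $m < n$, since otherwise the element
$\varepsilon = \sigma^{-1}(\varepsilon^a)$ would also have been a root of
$x^m - 1$, which contradicts the fact that $\varepsilon$ is primitive. Thus,
$(a,n) = 1$.

The automorphism $\sigma$ is completely determined by its values on the
generators of $K$, i.e., $\sigma(\varepsilon) = \varepsilon^a$,
$\sigma(\alpha) = \varepsilon^b\alpha$. Therefore such an automorphism $\sigma$
can be denoted by the symbol

$$\sigma = [a,b], \quad \text{where } a \in (\mathbb{Z}/n\mathbb{Z})^* \text{ and } b \in \mathbb{Z}/n\mathbb{Z}.$$

Observe that if $\sigma_i = [a_i,b_i]$, $i = 1,2$, then

$$\sigma_1(\sigma_2(\varepsilon)) = \varepsilon^{a_1 a_2}, \qquad \sigma_1(\sigma_2(\alpha)) = \varepsilon^{a_1 b_2 + b_1}\alpha,$$

i.e.,

$$[a_1,b_1][a_2,b_2] = [a_1 a_2,\; a_1 b_2 + b_1].$$

Consider the homomorphism
$\varphi: G(K,k) \to (\mathbb{Z}/n\mathbb{Z})^*$ which sends $\sigma = [a,b]$
to $a \in (\mathbb{Z}/n\mathbb{Z})^*$. The kernel of this homomorphism consists
of the elements of the form $[1,b]$. For such elements the composition law is
as follows: $[1,b_1][1,b_2] = [1, b_1 + b_2]$. The kernel and the image of
$\varphi$ are abelian groups, and so $G(K,k)$ is solvable.

b) If $\varepsilon \in k$, then $\sigma(\varepsilon) = \varepsilon$. Hence
$\sigma = [1,b] \in \mathbb{Z}/n\mathbb{Z}$.

c) If $c = 1$, we may assume that $\alpha = 1$. In this case
$\sigma = [a,0] \in (\mathbb{Z}/n\mathbb{Z})^*$.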
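
The composition law for the symbols $[a,b]$ is easy to experiment with. The sketch below models all pairs $[a,b]$ with $a$ invertible modulo $n$; this full set of pairs is only an ambient group into which $G(K,k)$ embeds, and the names `affine_elements` and `multiply` are illustrative. It checks associativity, that the elements $[1,b]$ commute with each other, and that the whole group need not be abelian.

```python
from math import gcd


def affine_elements(n):
    """All symbols [a, b] with a invertible mod n and b arbitrary mod n."""
    return [(a, b) for a in range(n) if gcd(a, n) == 1 for b in range(n)]


def multiply(s1, s2, n):
    """The law [a1, b1][a2, b2] = [a1*a2, a1*b2 + b1] modulo n."""
    (a1, b1), (a2, b2) = s1, s2
    return (a1 * a2 % n, (a1 * b2 + b1) % n)


n = 5
G = affine_elements(n)
kernel = [s for s in G if s[0] == 1]   # kernel of [a, b] -> a

associative = all(
    multiply(multiply(x, y, n), z, n) == multiply(x, multiply(y, z, n), n)
    for x in G for y in G for z in G
)
kernel_abelian = all(multiply(x, y, n) == multiply(y, x, n) for x in kernel for y in kernel)

print(len(G), associative, kernel_abelian)  # 20 True True
print(multiply((2, 0), (1, 1), n), multiply((1, 1), (2, 0), n))  # (2, 2) (2, 1)
```

The last line shows two symbols that do not commute, so solvability here really comes from the abelian kernel and the abelian image, not from commutativity of the group itself.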


5.2.4 The cyclic extensions 


A normal extension K of the field k is called cyclic if the Galois group G(K, k) 
is cyclic. 


Theorem 5.2.10. If $k$ contains an $n$th primitive root of unity, then any
cyclic extension $K \supset k$ of degree $n$ is of the form $K = k(\beta)$,
where $\beta$ is a root of a polynomial $x^n - c$ irreducible over $k$.


Proof. We will need the fact that the characters of a group are linearly
independent. Let us recall the precise formulation and proof. Let $G$ be a
group, $K$ a field and $K^*$ the multiplicative group of the field, i.e., the
set of non-zero elements with respect to multiplication. A character of $G$ is
an arbitrary homomorphism $G \to K^*$.


Lemma. Distinct characters of G are linearly independent over K. 


Proof. Let $\{\chi_1,\dots,\chi_n\}$ be a minimal non-empty set of linearly
dependent characters, i.e.,

$$\lambda_1\chi_1(g) + \dots + \lambda_n\chi_n(g) = 0 \qquad (1)$$

for all $g \in G$ and certain fixed $\lambda_1,\dots,\lambda_n \in K^*$.
Clearly, $n \geq 2$. The characters $\chi_1$ and $\chi_n$ are distinct, and so
$\chi_1(h) \neq \chi_n(h)$ for some $h \in G$. Let us multiply (1) by
$\chi_n(h)$ and subtract from the result the identity
$\lambda_1\chi_1(hg) + \dots + \lambda_n\chi_n(hg) = 0$. After simplification
we obtain

$$\lambda_1\bigl(\chi_n(h) - \chi_1(h)\bigr)\chi_1(g) + \dots + \lambda_{n-1}\bigl(\chi_n(h) - \chi_{n-1}(h)\bigr)\chi_{n-1}(g) = 0.$$

This contradicts the minimality of the set $\{\chi_1,\dots,\chi_n\}$.


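If $\sigma$ is an automorphism of $K$, then the restriction of $\sigma$ onto
$K^*$ is a character of $K^*$. Therefore, if $\sigma_1,\dots,\sigma_n$ are
distinct automorphisms of $K$ and $\alpha_1,\dots,\alpha_n \in K^*$, then
$\alpha_1\sigma_1(a) + \dots + \alpha_n\sigma_n(a) \neq 0$ for a certain
$a \in K^* \subset K$.

Let us now pass to the proof of the theorem. Let $\sigma$ be the generator of
the cyclic group $G(K,k)$ and $\varepsilon$ an $n$th primitive root of unity
which belongs to $k$. Consider the Lagrange resolvent

$$(\varepsilon,a)_\sigma = a + \varepsilon\sigma(a) + \dots + \varepsilon^{n-1}\sigma^{n-1}(a).$$

The automorphisms $\mathrm{id}, \sigma, \sigma^2, \dots, \sigma^{n-1}$ are
distinct, and so there exists an element $a \in K$ for which
$(\varepsilon,a)_\sigma = \beta \neq 0$. It is easy to verify that
$\sigma(\beta) = \varepsilon^{-1}\beta$ and
$\sigma(\beta^n) = (\sigma(\beta))^n = \beta^n$. Therefore
$\sigma(\beta) \neq \beta$, i.e., $\beta \notin k$, and
$\sigma^i(\beta^n) = \beta^n$ for $i = 1,\dots,n-1$, i.e.,
$\beta^n = c \in k$.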
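
The two properties of the Lagrange resolvent can be observed numerically. The following sketch uses an illustrative model that is not taken from the text: $n = 3$, $K = k(t)$ with $t^3 = 2$, the field $k$ is assumed to contain $\varepsilon = e^{2\pi i/3}$, and an element $c_0 + c_1 t + c_2 t^2$ is stored as a list of three complex numbers. The generator $\sigma$ acts by $t \mapsto \varepsilon t$.

```python
import cmath

N = 3
EPS = cmath.exp(2j * cmath.pi / N)   # primitive cube root of unity, assumed to lie in k


def sigma(a):
    """The generator t -> EPS*t of the Galois group, acting on c0 + c1*t + c2*t^2."""
    return [a[m] * EPS**m for m in range(N)]


def multiply(a, b):
    """Product in K = k(t), reduced with the relation t^3 = 2."""
    out = [0j] * N
    for i in range(N):
        for j in range(N):
            if i + j < N:
                out[i + j] += a[i] * b[j]
            else:
                out[i + j - N] += 2 * a[i] * b[j]
    return out


def lagrange_resolvent(a):
    """(EPS, a)_sigma = a + EPS*sigma(a) + EPS^2*sigma^2(a)."""
    total = [0j] * N
    image = list(a)
    for j in range(N):
        total = [total[m] + EPS**j * image[m] for m in range(N)]
        image = sigma(image)
    return total


def close(u, v, tol=1e-9):
    return all(abs(x - y) < tol for x, y in zip(u, v))


a = [1, 1, 1]                                   # the element 1 + t + t^2
beta = lagrange_resolvent(a)                    # numerically 3*t^2
beta_cubed = multiply(beta, multiply(beta, beta))

print(close(beta, [0, 0, 3]))                         # True
print(close(sigma(beta), [x / EPS for x in beta]))    # True: sigma(beta) = beta / EPS
print(close(beta_cubed, [108, 0, 0], tol=1e-6))       # True: beta^3 = 108 lies in k
```

In this model $\beta = 3t^2$ and $\beta^3 = 108$, so $K = k(\beta)$ with $\beta$ a root of $x^3 - 108$.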

Consider the polynomial

$$x^n - c = (x - \beta)(x - \varepsilon\beta) \cdots (x - \varepsilon^{n-1}\beta).$$

The field $k(\beta)$ is its splitting field and $k(\beta) \subset K$. Since the
automorphisms $\mathrm{id}, \sigma, \sigma^2, \dots, \sigma^{n-1}$ are distinct
automorphisms of $k(\beta)$, it follows that the order of the Galois group of
$k(\beta)$ over $k$ is not less than $n$, i.e., $[k(\beta) : k] \geq n$. On the
other hand, $[K : k] = n$, and so $k(\beta) = K$. The Galois group of the
polynomial $x^n - c$ is transitive, and so the polynomial is irreducible.


5.3 How to solve equations by radicals


An extension $L$ of the field $k$ is said to be radical if there exists a
sequence of intermediate fields

$$k = L_0 \subset L_1 \subset \dots \subset L_r = L,$$

such that $L_i = L_{i-1}(\beta_i)$, where $\beta_i^{n_i} \in L_{i-1}$. In other
words, we consecutively add to the field $k$ the roots of the elements of the
fields obtained at the preceding step.

Let $f$ be an irreducible polynomial over $k$ and $\alpha_1,\dots,\alpha_n$ all
its roots. The equation $f(x) = 0$ is said to be solvable by radicals if the
field $k(\alpha_1,\dots,\alpha_n)$ is contained in a radical extension of $k$.


To formulate and prove the criterion of solvability of equations by rad- 
icals, we will need the notion of solvable groups. We therefore start with a 
recapitulation of the basic notions of solvable groups. 


5.3.1 Solvable groups 

A group $G$ is called solvable if there exists a sequence of nested subgroups

$$\{e\} = G_r \subset G_{r-1} \subset \dots \subset G_0 = G,$$

such that $G_i$ is a normal subgroup in $G_{i-1}$ and the quotient
$G_{i-1}/G_i$ is abelian (for $i = 1, 2, \dots, r$).

In what follows we only deal with finite groups.

For any finite abelian group $G$, one can construct a sequence of nested
subgroups for which all the quotient groups $G_{i-1}/G_i$ are cyclic (moreover,
the $G_{i-1}/G_i$ are cyclic groups of prime order). Therefore, in the
definition of a solvable group, we may assume that all the quotients
$G_{i-1}/G_i$ are cyclic.
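
As an illustration of the definition, the chain $\{e\} \subset V_4 \subset A_4 \subset S_4$ shows that $S_4$ is solvable. The sketch below verifies this by brute force; the helper names are illustrative, and the abelian quotient is tested through the equivalent condition that every commutator of the larger group lies in the smaller one.

```python
from itertools import permutations


def compose(s, t):
    """Composition of permutations stored as tuples: (s o t)(i) = s[t[i]]."""
    return tuple(s[i] for i in t)


def inverse(s):
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)


def is_even(s):
    inversions = sum(1 for i in range(len(s)) for j in range(i + 1, len(s)) if s[i] > s[j])
    return inversions % 2 == 0


def is_normal(sub, group):
    """Is g*h*g^(-1) in sub for all g in group and h in sub?"""
    return all(compose(compose(g, h), inverse(g)) in sub for g in group for h in sub)


def quotient_is_abelian(sub, group):
    """For a normal subgroup: group/sub is abelian iff all commutators lie in sub."""
    return all(
        compose(compose(x, y), compose(inverse(x), inverse(y))) in sub
        for x in group for y in group
    )


identity = tuple(range(4))
S4 = set(permutations(range(4)))
A4 = {s for s in S4 if is_even(s)}
V4 = {s for s in A4 if compose(s, s) == identity}   # identity and the double transpositions
chain = [S4, A4, V4, {identity}]

solvable = all(
    is_normal(chain[i + 1], chain[i]) and quotient_is_abelian(chain[i + 1], chain[i])
    for i in range(len(chain) - 1)
)
print([len(g) for g in chain], solvable)  # [24, 12, 4, 1] True
```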

In the description of the relation between solvability of a polynomial
equation $f(x) = 0$ by radicals and the solvability of the Galois group of $f$,
we use the Galois correspondence which provides an exact sequence
$0 \to H \to G \to G/H \to 0$. Usually, for some of these three groups, it is
known that they are solvable and we have to decide whether the remaining
groups are solvable. For this purpose, we use the following theorem.


Theorem 5.3.1. a) Any subgroup H of a solvable group G is solvable. 

b) Let H be a normal subgroup of G such that H and G/H are solvable. 
Then the group G is solvable. 

c) Let H be a normal subgroup of a solvable group G, then G/H is solvable. 


Proof. a) Let us show that the sequence of subgroups $H_i = G_i \cap H$
possesses the properties required, i.e., $H_i$ is a normal subgroup in
$H_{i-1}$ and the quotient $H_{i-1}/H_i$ is abelian. Since
$G_i \subset G_{i-1}$, it follows that $H_i = H_{i-1} \cap G_i$. Therefore
$H_i$ is a normal subgroup in $H_{i-1}$ and

$$H_{i-1}/H_i = H_{i-1}/(H_{i-1} \cap G_i) \cong G_i H_{i-1}/G_i \subset G_{i-1}/G_i.$$


b) For solvable groups $H$ and $G/H$, take the sequences that define their
solvability:

$$\{e\} = H_n \subset H_{n-1} \subset \dots \subset H_0 = H,$$
$$\{H\} = A_m \subset A_{m-1} \subset \dots \subset A_0 = G/H.$$

We define $G_i = p^{-1}(A_i)$, where $p: G \to G/H$ is the natural projection.
Clearly $G_m = H$ and $G_0 = G$. Let us show that the sequence of subgroups

$$\{e\} = H_n \subset H_{n-1} \subset \dots \subset H_0 = H = G_m \subset G_{m-1} \subset \dots \subset G_0 = G$$

possesses the properties required in the definition of a solvable group, i.e.,
$G_i$ is a normal subgroup in $G_{i-1}$ and $G_{i-1}/G_i$ is abelian.
The second property follows from the fact that
$G_{i-1}/G_i \cong A_{i-1}/A_i$.
Let $g_i \in G_i$ and $g_{i-1} \in G_{i-1}$. Then

$$p(g_{i-1}\,g_i\,g_{i-1}^{-1}) = p(g_{i-1})\,p(g_i)\,p(g_{i-1})^{-1} \in A_i,$$

since $A_i$ is a normal subgroup in $A_{i-1}$. Therefore
$g_{i-1}\,g_i\,g_{i-1}^{-1} \in G_i$, i.e., $G_i$ is a normal subgroup in
$G_{i-1}$.
c) For a solvable group $G$, take the sequence of subgroups that defines
solvability:

$$\{e\} = G_r \subset G_{r-1} \subset \dots \subset G_0 = G,$$

and define $A_i = G_i H/H$. Then the sequence of subgroups

$$\{H\} = A_r \subset A_{r-1} \subset \dots \subset A_0 = G/H$$

possesses the property required, i.e., $A_i$ is a normal subgroup in $A_{i-1}$
and $A_{i-1}/A_i$ is abelian. Indeed, let $g_i H \in A_i$ and
$g_{i-1}H \in A_{i-1}$. Then

$$(g_{i-1}H)\,g_i H\,(g_{i-1}H)^{-1} = g_{i-1}\,g_i H\,g_{i-1}^{-1} = g_{i-1}\,g_i\,g_{i-1}^{-1} H \in A_i.$$

Moreover,

$$A_{i-1}/A_i \cong G_{i-1}/(G_i H \cap G_{i-1}) \cong (G_{i-1}/G_i)\big/\bigl((G_i H \cap G_{i-1})/G_i\bigr),$$

i.e., the group $A_{i-1}/A_i$ is isomorphic to a quotient of an abelian group,
and hence $A_{i-1}/A_i$ is an abelian group.


5.3.2 Equations with solvable Galois group 


Using the Galois correspondence and Theorem 5.2.10 on the structure of the 
cyclic extension, it is not difficult to prove that any equation with solvable 
Galois group is solvable by radicals.


Theorem 5.3.2. Let $f$ be a polynomial without multiple roots over $k$ whose
Galois group $G_k(f)$ is solvable. Then the equation $f = 0$ is solvable by
radicals.


Proof. If the field $k$ does not contain any $d$th primitive root of unity,
where $d = |G_k(f)|$, then we add to $k$ a $d$th primitive root $\varepsilon$
of unity, i.e., we consider the field $L = k(\varepsilon)$. The Galois group
$G_L(f)$ is isomorphic to a subgroup of the solvable group $G_k(f)$
(Theorem 5.2.5), and so $G_L(f)$ is solvable itself and $|G_L(f)|$ divides $d$.
In particular, the field $L$ contains primitive roots of unity of any degree
that divides $|G_L(f)|$.

For the solvable group $G_L(f)$, we construct a sequence of subgroups

$$\{e\} = G_r \subset \dots \subset G_0 = G_L(f),$$

in which the quotients $G_{i-1}/G_i$ are cyclic. The Galois correspondence
assigns to this sequence a sequence of fields

$$K = L_r \supset \dots \supset L_1 \supset L_0 = L,$$

where $K$ is the splitting field of $f$ over $L$, such that the extension
$L_i \supset L_{i-1}$ is normal, and therefore the sequence of fields
$K \supset L_i \supset L_{i-1}$ provides an exact sequence

$$0 \to G(K,L_i) \to G(K,L_{i-1}) \to G(L_i,L_{i-1}) \to 0.$$

Therefore $G(L_i,L_{i-1}) \cong G(K,L_{i-1})/G(K,L_i) = G_{i-1}/G_i$ is a
cyclic group whose order divides $|G_L(f)|$. Since $L_{i-1} \supset L$ contains
a $d_i$th primitive root of unity, where $d_i = |G(L_i,L_{i-1})|$, we see by
Theorem 5.2.10 that $L_i = L_{i-1}(\beta_i)$, where $\beta_i$ is a root of the
polynomial $x^{d_i} - c_i$, where $c_i \in L_{i-1}$. Therefore $L_i$ is a
radical extension of $L_{i-1}$, and hence $K$ is a radical extension of $L$. It
is also clear that $L$ is a radical extension of $k$. Therefore the splitting
field of the polynomial $f$ is contained in a radical extension of $k$, i.e.,
the equation is solvable by radicals.


5.3.3 Equations solvable by radicals 


We have just proved that, if the Galois group of an equation is solvable, then 
this equation is solvable by radicals. Let us now prove the converse statement. 


Theorem 5.3.3. Let $f$ be a polynomial without multiple roots over a field $k$
and let the equation $f = 0$ be solvable by radicals. Then the Galois group
$G_k(f)$ is solvable.


Proof. By the hypothesis, for some field $L$ containing all the roots of $f$,
there exists a sequence of fields

$$L = L_r \supset \dots \supset L_0 = k, \qquad (1)$$

such that $L_i = L_{i-1}(\beta_i)$, where $\beta_i^{n_i} \in L_{i-1}$. Here the
extension $L \supset k$ is not necessarily normal, and therefore we cannot
directly apply the Galois correspondence. We therefore begin with the
construction of a radical extension $K \supset L$ for which the extension
$K \supset k$ is normal.

We use induction on $r$. For $r = 0$, the statement is obvious. Hence, by the
inductive hypothesis, we may assume that we have already constructed a radical
extension $K_{r-1} \supset L_{r-1}$ for which the extension
$K_{r-1} \supset k$ is normal. Set $K' = K_{r-1}$ and
$L' = K'(\beta_r) \supset L$. The induction step consists in the proof of the
following statement.


Lemma. Let $K' \supset k$ be a normal extension and let $L' = K'(\beta)$, where
$\beta^n \in K'$. Then there exists a radical extension $K \supset L'$ such
that the extension $K \supset k$ is normal.


Proof. Consider an irreducible over $k$ polynomial $g(x)$ with root
$\beta^n \in K'$. The extension $K' \supset k$ is normal, and so all the roots
of $g(x)$ lie in $K'$. Set $h(x) = g(x^n)$ and consider the field $K$, the
splitting field of $h$ over $K'$. Let us prove that $K$ possesses all the
properties required.

1. The extension $K \supset k$ is normal. Indeed, the extension $K' \supset k$
is normal, and hence $K'$ is the splitting field over $k$ of a polynomial
$\varphi(x)$. In this case $K$ is the splitting field over $k$ of the
polynomial $h(x)\varphi(x)$.

2. $K \supset L' = K'(\beta)$. Indeed, by definition $K' \subset K$. We also
have $\beta \in K$ since $h(\beta) = g(\beta^n) = 0$.

3. The extension $K \supset L'$ is radical. Let $\beta'$ be a root of $h(x)$.
Then $(\beta')^n$ is a root of $g(x)$, and hence
$(\beta')^n \in K' \subset L'$. It remains to observe that the field $K$ is
obtained by adjoining to $L'$ all the roots $\beta'$ of the polynomial $h(x)$.


Thus, we may assume that the field $L$ in (1) is a normal extension of $k$.
Moreover, we may assume that the numbers $n_i$ are primes and the degree of
the extension $L_i \supset L_{i-1}$ is equal to $n_i$ (the adjoining of a root
of degree $pq$ can be replaced by the adjoining of a root of degree $p$ with a
subsequent adjoining of a root of degree $q$).

Since the extension $L \supset k$ is normal, the extension $L \supset L_{i-1}$
is also normal. The coefficients of the polynomial $x^{n_i} - \beta_i^{n_i}$
belong to $L_{i-1}$ and its root $\beta_i$ belongs to $L$ but not to
$L_{i-1}$. Therefore the polynomial $x^{n_i} - \beta_i^{n_i}$ has an
irreducible over $L_{i-1}$ divisor with root $\beta_i$ and this divisor is
different from $x - \beta_i$. Hence the field $L$ contains a root of the
polynomial $x^{n_i} - \beta_i^{n_i}$ distinct from $\beta_i$, and therefore the
field $L$ contains an $n_i$th primitive root of unity (we use the fact that
$n_i$ is a prime).

The field $L$ contains primitive roots of unity of all degrees $n_i$, and so it
contains a primitive root $\varepsilon$ of unity whose degree is divisible by
all the $n_i$. Set $L'_i = L_i(\varepsilon)$ and consider the sequence of
subfields

$$L = L'_r \supset L'_{r-1} \supset \dots \supset L'_0 = k(\varepsilon) \supset k.$$

From the Galois correspondence we obtain a sequence of subgroups

$$\{e\} = G(L,L'_r) \subset G(L,L'_{r-1}) \subset \dots \subset G(L,L'_0) = G(L,k(\varepsilon)) \subset G(L,k).$$

The extension $L'_i \supset L'_{i-1}$ is normal because $L'_i$ is the splitting
field over $L'_{i-1}$ of the polynomial $x^{n_i} - \beta_i^{n_i}$. Therefore
the sequence of fields $L'_{i-1} \subset L'_i \subset L$ provides us with an
exact sequence of groups

$$0 \to G(L,L'_i) \to G(L,L'_{i-1}) \to G(L'_i,L'_{i-1}) \to 0.$$

Thus, $G(L,L'_{i-1})/G(L,L'_i) \cong G(L'_i,L'_{i-1})$ is a cyclic group of
order $n_i$. Hence $G(L,k(\varepsilon))$ is a solvable group.

The next step is the proof of solvability of $G(L,k)$. The extension
$k(\varepsilon) \supset k$ is normal, so for the sequence of fields
$k \subset k(\varepsilon) \subset L$ we obtain an exact sequence of groups

$$0 \to G(L,k(\varepsilon)) \to G(L,k) \to G(k(\varepsilon),k) \to 0.$$

The group $G(k(\varepsilon),k)$ is abelian by Theorem 5.2.9 (c) on page 197,
and so $G(L,k)$ is solvable.

The last step is the proof of solvability of the group $G_k(f) = G(N,k)$,
where $N$ is the splitting field of the polynomial $f$ over $k$. The sequence
of fields $k \subset N \subset L$ yields an exact sequence of groups

$$0 \to G(L,N) \to G(L,k) \to G(N,k) \to 0.$$

Therefore $G(N,k)$ is a quotient of the solvable group $G(L,k)$, and hence is
solvable itself.


Example. The equation
$$x^5 - 4x + 2 = 0$$
is not solvable by radicals.


Proof. The Galois group of the polynomial $x^5 - 4x + 2$ over $\mathbb{Q}$ is
equal to $S_5$ (see page 196). It remains to prove that $S_5$ is non-solvable.
Observe, first of all, that if $H \subset G$ is a normal subgroup such that the
group $G/H$ is abelian, then for any $x, y \in G$ the element
$xyx^{-1}y^{-1}$ belongs to $H$. In $S_5$, there is a normal subgroup $A_5$
consisting of the even permutations. It is easy to verify that any element of
$A_5$ can be represented in the form $xyx^{-1}y^{-1}$, where $x, y \in A_5$.
Indeed, any non-identity element of $A_5$ is either a cycle of length 5, or a
cycle of length 3, or the product of two transpositions $(ij)(kl)$ with
distinct $i, j, k, l$.

For the cycle (12345), set $x = (12534)$ and $y = (12)(35)$.

For the cycle (123), set $x = (123)$ and $y = (23)(45)$.

For (12)(34), set $x = (14)(23)$ and $y = (123)$.
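
The general claim, that every element of $A_5$ has the form $xyx^{-1}y^{-1}$ with $x, y \in A_5$, can also be confirmed by exhaustive search over all 3600 pairs. The sketch below does this independently of the particular pairs listed above; permutations of {0, ..., 4} are stored as tuples and the helper names are illustrative.

```python
from itertools import permutations


def compose(s, t):
    """Composition of permutations stored as tuples: (s o t)(i) = s[t[i]]."""
    return tuple(s[i] for i in t)


def inverse(s):
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)


def is_even(s):
    inversions = sum(1 for i in range(len(s)) for j in range(i + 1, len(s)) if s[i] > s[j])
    return inversions % 2 == 0


def commutators(group):
    """The set of all elements x*y*x^(-1)*y^(-1) with x, y in the group."""
    return {
        compose(compose(x, y), compose(inverse(x), inverse(y)))
        for x in group for y in group
    }


S5 = list(permutations(range(5)))
A5 = [s for s in S5 if is_even(s)]

print(len(A5))                         # 60
print(commutators(A5) == set(A5))      # True: every element of A5 is a commutator in A5
print(commutators(S5) == set(A5))      # True: the commutators of S5 fill out exactly A5
```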

Therefore, if $H \subset S_5$ is a normal subgroup and the group $S_5/H$ is
abelian, then $H = A_5$ (or $S_5$). But already $A_5$ has no normal subgroup
$K \neq A_5$ such that $A_5/K$ is abelian.


5.3.4 Abelian equations


In the memoir “On a particular class of algebraically solvable equations” Abel 
proved three important statements concerning solvability of equations by rad- 
icals. 

1. If one of the roots of an irreducible polynomial $f$ can be rationally
expressed in terms of another root, then the solution of the equation
$f(x) = 0$ reduces to the solution of several equations of lesser degrees.

2. If the roots of an irreducible polynomial $f$ are of the form $x_1$,
$\theta(x_1)$, $\theta^2(x_1) = \theta(\theta(x_1))$, $\dots$,
$\theta^{n-1}(x_1)$, where $\theta$ is a rational function such that
$\theta^n(x_1) = x_1$, then the equation $f(x) = 0$ is solvable by radicals.

3. If the roots of an irreducible polynomial $f$ are of the form $x_1$,
$\theta_2(x_1)$, $\theta_3(x_1)$, $\dots$, $\theta_n(x_1)$, where the
$\theta_i$ are rational functions such that
$\theta_i\theta_j(x_1) = \theta_j\theta_i(x_1)$, then the equation $f(x) = 0$
is solvable by radicals. Moreover, if $\deg f = p_1^{n_1} \cdots p_s^{n_s}$,
then the solution of $f(x) = 0$ reduces to the solution of $n_1$ equations of
degree $p_1$, $n_2$ equations of degree $p_2$, etc.

The polynomial f in statement 3 is called abelian, and the equation f(x) = 
0 is called an abelian equation. Clearly, the Galois group of a polynomial g 
is abelian if and only if the Galois resolvent of g is an abelian polynomial. 
Therefore Abel’s theorem is a particular case of the Galois theorem which 
reads as follows: 


any equation with an abelian Galois group is solvable by radicals. 


Nevertheless, the methods of Abel's theorem still retain a certain
significance because Abel's solution of abelian equations is rather
constructive.

The polynomial f in statement 2 is called a cyclic abelian polynomial. The 
Galois group of such a polynomial is cyclic. 

To solve cyclic abelian equations, Abel applied methods developed by La- 
grange and Gauss. His contribution consists in the fact that he separated the 
most general class of equations to which these methods are applicable. More- 
over, studying the theory of elliptic functions, Abel found a new interesting 
example of a cyclic abelian equation, namely, the lemniscate division equation. 
For a modern proof of the fact that the lemniscate division equation is a cyclic 
abelian equation, see [Pr3]. 

Let us begin with statement 1. It deals with polynomials of a particular 
form but with an arbitrary Galois group. Indeed, all the roots of the Galois 
resolvent of any polynomial can be rationally expressed in terms of one of the 
roots. 


Theorem 5.3.4. a) Let $f$ be an irreducible polynomial over a field $k$ of
characteristic zero, one of whose roots can be rationally expressed in terms of
another root. Then all the roots of $f$ can be organized into a table (elucidated
in the course of the proof)
\[
\begin{array}{llll}
x_1^1, & x_2^1 = \theta(x_1^1), & \dots, & x_p^1 = \theta^{p-1}(x_1^1),\\
\dots & & & \\
x_1^m, & x_2^m = \theta(x_1^m), & \dots, & x_p^m = \theta^{p-1}(x_1^m),
\end{array}
\]
where $\theta$ is the rational function such that $\theta^p(x_1^i) = x_1^i$ for $i = 1, \dots, m$.


b) The problem of solving the equation $f = 0$ reduces to solving an equation
$g = 0$ of degree $m$ with coefficients in $k$ and cyclic abelian equations
$h_1 = 0, \dots, h_m = 0$, where $h_i$ is a polynomial of degree $p$ whose coefficients are
rationally expressed (over $k$) in terms of a root $y_i$ of $g$.


Proof. a) Let $x_1, \dots, x_n$ be the roots of $f$ and let $x_2 = \theta(x_1)$, where $\theta$ is
a rational function. Consider the polynomial
\[ g(x) = \prod_{i=1}^{n} \bigl(x - \theta(x_i)\bigr). \]
The coefficients of this polynomial are symmetric functions in $x_1, \dots, x_n$, and hence
belong to $k$. The polynomials $f$ and $g$ have a common root $x_2 = \theta(x_1)$. But
the polynomial $f$ is irreducible and $\deg f = \deg g$, and so $\theta(x_1), \dots, \theta(x_n)$ is
a permutation of the numbers $x_1, \dots, x_n$. Thus to each root $x_i$ there
corresponds, uniquely, a root $x_j$, i.e., $x_j = \theta(x_i)$.

Consider all the possible cycles $x_i, \theta(x_i), \theta^2(x_i), \dots, \theta^{p-1}(x_i)$, where $p > 0$
is the first integer such that $\theta^p(x_i) = x_i$. Clearly, any two cycles, considered
as sets, either do not intersect or coincide. It only remains to prove that the
length of all the cycles is the same. Let $p$ be the least length of the cycles. If
$\theta^p(x) = x$ for all $x$, then all the cycles are of length $p$. If $\theta^p(x) \not\equiv x$, then the
equations $\theta^p(x) = x$ and $f(x) = 0$ have a common root. From the irreducibility
of $f$ it follows that $\theta^p(x_i) = x_i$ for all $i = 1, \dots, n$, and hence all the cycles
are of length $p$.

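b) To avoid cumbersome notations, we assume that $p = 3$ and $m = 4$. In
the general case the proof is the same. The table of roots can be expressed as
follows:
\[
\begin{array}{lll}
x_1, & x_2 = \theta(x_1), & x_3 = \theta^2(x_1),\\
x_4, & x_5 = \theta(x_4), & x_6 = \theta^2(x_4),\\
x_7, & x_8 = \theta(x_7), & x_9 = \theta^2(x_7),\\
x_{10}, & x_{11} = \theta(x_{10}), & x_{12} = \theta^2(x_{10}).
\end{array}
\]
Let $q$ be an arbitrary symmetric polynomial in three variables over $k$. Then
\[ \varphi(x_1) = q\bigl(x_1, \theta(x_1), \theta^2(x_1)\bigr) = q\bigl(\theta(x_1), \theta^2(x_1), x_1\bigr) = \varphi(x_2). \]
Similarly, $\varphi(x_1) = \varphi(x_3)$. Thus, if
\[ \varphi(x_i) = q\bigl(x_i, \theta(x_i), \theta^2(x_i)\bigr), \]
then
\[
\begin{aligned}
\varphi(x_1) &= \varphi(x_2) = \varphi(x_3) = q_1, & \varphi(x_4) &= \varphi(x_5) = \varphi(x_6) = q_2,\\
\varphi(x_7) &= \varphi(x_8) = \varphi(x_9) = q_3, & \varphi(x_{10}) &= \varphi(x_{11}) = \varphi(x_{12}) = q_4.
\end{aligned}
\]
Hence
\[ q_1 + q_2 + q_3 + q_4 = \frac{1}{3}\sum_{i=1}^{12} \varphi(x_i) \in k. \]
Considering the functions $q^2$, $q^3$, $q^4$ instead of $q$, we see that
$\sum q_i^2$, $\sum q_i^3$, $\sum q_i^4 \in k$. This means that the coefficients
of the polynomial $\prod (y - q_i)$ lie in $k$.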

If $r$ is one more symmetric polynomial in three variables, we can consider
symmetric polynomials $rq^l$ for $l = 0, 1, 2, 3$. Now, determine the $r_i$ in the same
way as the $q_i$. Then the system of equations
\[ r_1q_1^l + r_2q_2^l + r_3q_3^l + r_4q_4^l = R_l, \quad \text{where } l = 0, 1, 2, 3 \text{ and } R_l \in k, \]
shows that $r_i = \Phi(q_i)$, where $\Phi$ is a rational function (the same for all $i$).
Let $q = t_1 + t_2 + t_3$, $r = t_1t_2 + t_2t_3 + t_3t_1$ and $\tilde r = t_1t_2t_3$. Then
\[ q_1 = x_1 + x_2 + x_3 = y_1 \]
(a root of $\prod (y - q_i)$), and
\[ r_1 = x_1x_2 + x_2x_3 + x_3x_1 = \Phi(y_1) \]
and $\tilde r_1 = x_1x_2x_3$; we denote: $\tilde r_1 = \Psi(y_1)$. Therefore $x_1, x_2, x_3$ are the roots of
the equation
\[ x^3 - y_1x^2 + \Phi(y_1)x - \Psi(y_1) = 0, \]
where $\Phi$ and $\Psi$ are rational functions. This equation is a cyclic abelian one
since $x_2 = \theta(x_1)$, $x_3 = \theta^2(x_1)$ and $x_1 = \theta^3(x_1)$.


Theorem 5.3.5. a) Any cyclic abelian equation is solvable by radicals. 
b) The solution of a cyclic abelian equation of order $n = pm$ reduces to
the solution of two cyclic abelian equations of orders $p$ and $m$.


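Proof. a) Let $f$ be a cyclic abelian polynomial of degree $n$ with roots
$x_1, \dots, x_n$ and let $\varepsilon$ be a primitive $n$th root of unity. Let $\theta$ be a rational
function such that $x_{k+1} = \theta^k(x_1)$ and $x_1 = \theta^n(x_1)$. Consider the Lagrange
resolvent
\[ (\varepsilon^r, x_1) = x_1 + \varepsilon^r\theta(x_1) + \varepsilon^{2r}\theta^2(x_1) + \dots + \varepsilon^{(n-1)r}\theta^{n-1}(x_1). \]
Since $x_{k+1} = \theta^k(x_1)$, it follows that
\[ (\varepsilon^r, x_{k+1}) = \theta^k(x_1) + \varepsilon^r\theta^{k+1}(x_1) + \varepsilon^{2r}\theta^{k+2}(x_1) + \dots = \sum_{s=0}^{n-1} \varepsilon^{rs}\theta^{k+s}(x_1). \]
Clearly, $\theta^{k+s}(x_1) = x_1$ for $s = n - k$, and therefore $(\varepsilon^r, x_{k+1}) = \varepsilon^{-kr}(\varepsilon^r, x_1)$.
In particular, $(\varepsilon^r, x_{k+1})^n = (\varepsilon^r, x_1)^n$. Hence
\[ (\varepsilon^r, x_1)^n = (\varepsilon^r, x_2)^n = \dots = (\varepsilon^r, x_n)^n = \frac{1}{n}\sum_{i=1}^{n} (\varepsilon^r, x_i)^n = u_r(\varepsilon), \]
where $u_r$ is a rational function.

Thus
\[ x_1 + \varepsilon^r\theta(x_1) + \varepsilon^{2r}\theta^2(x_1) + \dots + \varepsilon^{(n-1)r}\theta^{n-1}(x_1) = \sqrt[n]{u_r(\varepsilon)} \qquad (*) \]
for $r = 1, \dots, n-1$. Further, for $r = 0$, we see that $x_1 + x_2 + \dots + x_n = -a_1$,
where $a_1$ is the coefficient of $x^{n-1}$ in the polynomial $f$. Adding up all the
equalities $(*)$ for $r = 0, 1, \dots, n-1$ we see that the sum of coefficients of $x_1$
is equal to $n$, and the sum of coefficients of $\theta^m(x_1)$, where $1 \le m \le n-1$, is
equal to
\[ 1 + \varepsilon^m + \varepsilon^{2m} + \dots + \varepsilon^{(n-1)m} = 0. \]
Thus,
\[ nx_1 = -a_1 + \sqrt[n]{u_1(\varepsilon)} + \dots + \sqrt[n]{u_{n-1}(\varepsilon)}. \]
If we multiply the $r$th equality $(*)$ by $\varepsilon^{-kr}$ we similarly deduce that
\[ nx_{k+1} = -a_1 + \varepsilon^{-k}\sqrt[n]{u_1(\varepsilon)} + \dots + \varepsilon^{-k(n-1)}\sqrt[n]{u_{n-1}(\varepsilon)}. \]
It is not difficult to obtain a slightly more precise expression:
\[ nx_{k+1} = -a_1 + y + A_2y^2 + \dots + A_{n-1}y^{n-1}, \]
where $y = \varepsilon^{-k}\sqrt[n]{u_1(\varepsilon)}$ and $A_2, \dots, A_{n-1}$ are constants which rationally
depend on $\varepsilon$. Indeed,
\[ \frac{(\varepsilon^r, x_{k+1})}{(\varepsilon, x_{k+1})^r} = \frac{\varepsilon^{-kr}(\varepsilon^r, x_1)}{\bigl(\varepsilon^{-k}(\varepsilon, x_1)\bigr)^r} = \frac{(\varepsilon^r, x_1)}{(\varepsilon, x_1)^r} = A_r. \]
This means that $A_r$ depends rationally on $\varepsilon$ and symmetric functions in
$x_1, \dots, x_n$.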
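
The resolvent identities above can be checked numerically. The following is only a sketch under stated assumptions: the sample polynomial $f(x) = x^3 - 3x + 1$ and the function $\theta(x) = x^2 - 2$ are chosen here for illustration (they are not taken from the text), and the names `theta`, `resolvent` and `recover` are hypothetical. The resolvents are computed directly from floating-point roots; no attempt is made to select the correct branch of $\sqrt[n]{u_r(\varepsilon)}$.

```python
import cmath
import math

# Assumed example: f(x) = x^3 - 3x + 1, whose roots 2cos(2*pi/9),
# 2cos(4*pi/9), 2cos(8*pi/9) are permuted cyclically by theta(x) = x^2 - 2.
def theta(x):
    return x * x - 2

n = 3
a1 = 0  # coefficient of x^(n-1) in f
x1 = 2 * math.cos(2 * math.pi / 9)
roots = [x1]
for _ in range(n - 1):
    roots.append(theta(roots[-1]))      # x_{k+1} = theta^k(x_1)

eps = cmath.exp(2j * math.pi / n)       # primitive nth root of unity

def resolvent(r, k):
    """Lagrange resolvent (eps^r, x_{k+1}) = sum_s eps^(r*s) * theta^(k+s)(x_1)."""
    return sum(eps ** (r * s) * roots[(k + s) % n] for s in range(n))

# u_r = (eps^r, x_{k+1})^n should not depend on k.
u = [resolvent(r, 0) ** n for r in range(n)]

def recover(k):
    """n * x_{k+1} = -a1 + sum_{r=1}^{n-1} eps^(-k*r) * (eps^r, x_1)."""
    total = -a1 + sum(eps ** (-k * r) * resolvent(r, 0) for r in range(1, n))
    return total / n
```

In this sketch `recover(k)` reproduces the root $x_{k+1}$ up to rounding error, and the $n$th powers of the resolvents agree for all starting roots, as the proof asserts.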

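b) To avoid cumbersome notations, we assume that $p = 3$ and $m = 4$. In
the general case the proof is the same. Let
\[
\begin{aligned}
y_1 &= x_1 + x_5 + x_9 = x_1 + \theta^4(x_1) + \theta^8(x_1),\\
y_2 &= x_2 + x_6 + x_{10} = \theta(x_1) + \theta^5(x_1) + \theta^9(x_1),\\
y_3 &= x_3 + x_7 + x_{11} = \theta^2(x_1) + \theta^6(x_1) + \theta^{10}(x_1),\\
y_4 &= x_4 + x_8 + x_{12} = \theta^3(x_1) + \theta^7(x_1) + \theta^{11}(x_1).
\end{aligned}
\]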


In the situation considered, the conditions of Theorem 5.3.4 (with $\theta$ replaced
with $\theta^4$) are satisfied, and so $x_1$, $\theta^4(x_1)$ and $\theta^8(x_1)$ are the roots of a cyclic
abelian equation of order $p = 3$ whose coefficients are rational functions in
$y_1$. Further, $y_1, y_2, y_3$ and $y_4$ are the roots of an equation of degree $m = 4$
with rational coefficients. It only remains to prove that this equation is cyclic
abelian.

Let
\[ q_l(x_1) = \bigl(\theta(x_1) + \theta^5(x_1) + \theta^9(x_1)\bigr)\bigl(x_1 + \theta^4(x_1) + \theta^8(x_1)\bigr)^l. \]
Then $q_l(x_1) = y_2y_1^l$. Further,
\[ q_l(x_5) = (x_6 + x_{10} + x_2)(x_5 + x_9 + x_1)^l = q_l(x_1). \]
Similarly, $q_l(x_9) = q_l(x_5)$. Similar arguments show that
\[
\begin{aligned}
y_3y_2^l &= q_l(x_2) = q_l(x_6) = q_l(x_{10}),\\
y_4y_3^l &= q_l(x_3) = q_l(x_7) = q_l(x_{11}),\\
y_1y_4^l &= q_l(x_4) = q_l(x_8) = q_l(x_{12}).
\end{aligned}
\]
Hence
\[ y_2y_1^l + y_3y_2^l + y_4y_3^l + y_1y_4^l = \frac{1}{3}\sum_{i=1}^{12} q_l(x_i) \in k. \]
From the system of equations
\[ y_2y_1^l + \dots + y_1y_4^l = T_l, \quad \text{where } l = 0, 1, 2, 3 \text{ and } T_l \in k, \]
we deduce that $y_2 = \psi(y_1)$, $y_3 = \psi(y_2)$, $y_4 = \psi(y_3)$ and $y_1 = \psi(y_4)$.


Corollary. Any cyclic abelian equation of degree $2^n$ can be solved by
quadratic radicals.


5.3.5 The Abel-Galois criterion for solvability of equations of 
prime degree 


Evariste Galois perished in a duel and had no time to publish his main results 
in the study of the theory of solvability of polynomial equations by radicals. 
Still, he managed to publish some of his results. In a short notice in “Bul- 
letin des Sciences mathématiques” (1830) Galois communicated the following 
result: 


In order that an equation of prime degree be solvable by radicals it is nec- 
essary and sufficient that given any two of its roots the others would rationally 
depend on them. 


It is interesting to observe that in 1828 Abel wrote to Crelle that he had found
a criterion for solvability by radicals of equations of prime degree. Abel
formulated his criterion in almost the same words as Galois: “In any triple of
the roots one root should be rationally expressed in terms of the other two”.
No testimony, however, on the existence of Abel’s proof has survived.


Theorem 5.3.6. a) An equation $f = 0$ of prime degree $p$, irreducible over $\mathbb{Q}$,
is solvable by radicals if and only if its roots can be numbered so that any
permutation $\sigma$ from the Galois group is of the form $\sigma(i) \equiv ai + b \pmod p$,
where $a \not\equiv 0 \pmod p$.

b) An equation $f = 0$ of prime degree $p$, irreducible over $\mathbb{Q}$, is solvable by
radicals if and only if all its roots can be rationally expressed in terms of any
two of the roots. (In other words, if $\alpha_1, \dots, \alpha_p$ are all the roots of the equation,
then $\mathbb{Q}(\alpha_1, \dots, \alpha_p) = \mathbb{Q}(\alpha_i, \alpha_j)$ for any distinct $i$ and $j$.)


Proof. a) For the Galois group indicated, the multiplication law is of the
form
\[ [a_1, b_1][a_2, b_2] = [a_1a_2, a_1b_2 + b_1]. \]


The solvability of this group was proved in the theorem on simple radical 
extensions (Theorem 5.2.9 on page 197). Therefore it remains to prove that, 
if an irreducible equation of prime degree p is solvable by radicals, then its 
Galois group consists of transformations of the form indicated. 

When proving theorems on equations solvable by radicals, we have shown 
that, for an equation f = 0 solvable by radicals, there exists a sequence of 
radical extensions of prime degrees 


boi oT y s+ 5b, = ig) a iy, 


where $\varepsilon$ is an $n$th primitive root of unity, where $n = [L : \mathbb{Q}]$, the extension
$L \supset \mathbb{Q}$ is normal and $L$ contains the splitting field $N$ for $f$.

In this scenario, $G(N, \mathbb{Q}) = G(L, \mathbb{Q})/G(L, N)$. Therefore it suffices to
prove that any automorphism of $L$ over $\mathbb{Q}$ permutes the roots of $f$ in the
manner indicated above, i.e., the root number $i$ is replaced by the root
number $ai + b$.

We may assume that $L'_{s-1}$ does not contain all the roots of $f$. The proof
of the theorem is based on the fact that $f$ is irreducible over $L'_{s-1}$ and the
degree of the extension $L \supset L'_{s-1}$ is equal to $p$. This means that, until the
very last radical extension, the polynomial $f$ remains irreducible whereas at
the last step it factorizes into linear factors. The statement required obviously
follows from the next lemma.


Lemma. Let $f$ be irreducible over a field $k$ which contains a $q$th primitive
root of unity, where $q$ is a prime. Further, let $K = k(\beta)$, where $\beta^q \in k$. Then,
over $K$, the polynomial $f$ is either irreducible or factorizes into $q$ irreducible
factors of equal degree.


Proof. Let $L$ be the splitting field for $f$ over $k$. The field $k$ contains a $q$th
primitive root of unity, and so $L(\beta)$ is the splitting field of $f$ over $k(\beta)$.

The sequences of extensions $L(\beta) \supset k(\beta) \supset k$ and $L(\beta) \supset L \supset k$ yield the
exact sequences
\[ 0 \to G(L(\beta), k(\beta)) \to G(L(\beta), k) \to G(k(\beta), k) \to 0, \]
\[ 0 \to G(L(\beta), L) \to G(L(\beta), k) \to G(L, k) \to 0. \]
Therefore
\[ |G(L(\beta), k(\beta))| \cdot |G(k(\beta), k)| = |G(L, k)| \cdot |G(L(\beta), L)|. \]
By Theorem 5.2.5 on page 195, the groups $G(L(\beta), k(\beta))$ and $G(L(\beta), L)$
are subgroups in $G(L, k)$ and $G(k(\beta), k)$, respectively. Moreover, by Theorem
5.2.9 (b) on page 197, the groups $G(L(\beta), L)$ and $G(k(\beta), k)$ are subgroups
in $\mathbb{F}_q$. By the hypothesis $q$ is a prime, and so $\mathbb{F}_q$ has no nontrivial subgroups,
and therefore
\[ [G(L, k) : G(L(\beta), k(\beta))] = 1 \text{ or } q, \]
where the index is equal to $q$ only if $G(L(\beta), L)$ is trivial,
i.e., $G(L(\beta), k) = G(L, k)$. In this case $G(L(\beta), k(\beta))$ is a normal subgroup
in $G(L, k)$ of index $q$.

Recall that a polynomial (without multiple roots) is irreducible if and only
if its Galois group acts transitively on the set of its roots (Theorem 5.2.6 on
page 195). Hence we only have to consider the case when $H = G(L(\beta), k(\beta))$
is a normal subgroup of $G = G(L, k)$ of index $q$. In this case $G/H \cong \mathbb{F}_q$, and
so
\[ G = \{H, gH, g^2H, \dots, g^{q-1}H\} \quad \text{for some } g \in G. \]
Let $\alpha_1, \dots, \alpha_n$ be all the roots of $f$. We define the set $H\alpha_i = \{h(\alpha_i) \mid h \in H\}$.
Then $H\alpha_i$ and $H\alpha_j$ either coincide or do not intersect. Therefore the sets $H\alpha_1$
and $gH\alpha_1 = Hg(\alpha_1)$ either coincide or do not intersect.

In the first case, the group $H$ transitively acts on the set of roots, whereas
in the second case the set of roots splits into $q$ subsets of equal cardinality on
each of which the group $H$ transitively acts. Let $n = rq$ and let $H$ transitively
act on the set $\alpha_{i_1}, \dots, \alpha_{i_r}$. Then $H$ preserves all the coefficients of the
polynomial $(x - \alpha_{i_1}) \cdots (x - \alpha_{i_r})$, and hence this polynomial is irreducible over
$k(\beta)$.


Thus, $f$ is irreducible over $L'_{s-1}$ and it factorizes into linear factors over
$L'_s = L'_{s-1}(\beta)$, where $\beta^q \in L'_{s-1}$. This means, in particular, that $q = p$.
Therefore $L = L'_s$ is a cyclic extension of $L'_{s-1}$ of degree $p$ and $L'_{s-1}$ contains
a $p$th primitive root of unity. Hence $G(L, L'_{s-1}) \cong \mathbb{F}_p$. Let $\sigma$ be a generator of
this group. The roots $\alpha_1, \dots, \alpha_p$ of $f$ can be numbered so that $\sigma(i) = i + 1$,
i.e., $\sigma$ sends $\alpha_i$ into $\alpha_{i+1}$.

The group $G(L, L'_{s-1})$ is a normal subgroup of $G(L, L'_{s-2})$. Let $\tau$ be an
arbitrary element of $G(L, L'_{s-2})$. Then $\tau\sigma\tau^{-1} \in G(L, L'_{s-1})$, i.e., $\tau\sigma\tau^{-1} = \sigma^a$
for some $a$. This means that $\tau\sigma(i) = \sigma^a\tau(i)$, i.e., $\tau(i+1) = \tau(i) + a$. Therefore
\[ \tau(i) = \tau(1) + (i-1)a = ai + (\tau(1) - a) = ai + b, \]
where $b = \tau(1) - a$. The element $\tau$ is of the form required.
Let us prove now that if all the elements of $G(L, L'_k)$ are of the form
$\tau(i) = ai + b$, then all the elements of $G(L, L'_{k-1})$ are also of the same form.
In the proof we use the fact that $G(L, L'_k)$ contains an element $\sigma$ such that
$\sigma(i) = i + 1$ and $G(L, L'_k)$ is a normal subgroup of $G(L, L'_{k-1})$. Let $\mu$ be
an arbitrary element of $G(L, L'_{k-1})$. Then $\mu\sigma\mu^{-1} \in G(L, L'_k)$, so $\mu\sigma\mu^{-1}(i) = ai + b$.
For $j = \mu^{-1}(i)$, we have $\mu\sigma(j) = a\mu(j) + b$, i.e., $\mu(j+1) = a\mu(j) + b$.
Let us prove, first of all, that $a = 1$. Indeed,
\[
\begin{aligned}
\mu(2) &= a\mu(1) + b,\\
&\dots\\
\mu(i) &= a^{i-1}\mu(1) + (a^{i-2} + a^{i-3} + \dots + a + 1)b.
\end{aligned}
\]
Hence, $\mu(1) = \mu(p+1) = a^p\mu(1) + (a^{p-1} + a^{p-2} + \dots + a + 1)b$, i.e.,
\[ (1 - a^p)\mu(1) \equiv (a^{p-1} + a^{p-2} + \dots + a + 1)b \pmod p. \]
Let us multiply both sides of the last congruence by $1 - a$ and use the fact
that $a^p \equiv a \pmod p$. As a result, we obtain
\[ (1 - a)^2\mu(1) \equiv (1 - a)b \pmod p. \]
If $a \not\equiv 1 \pmod p$, then $\mu(1) \equiv (1 - a)^{-1}b \pmod p$. But then the same
arguments show that $\mu(2) \equiv (1 - a)^{-1}b \pmod p$, which is impossible.

b) If an irreducible equation of prime degree is solvable by radicals, then
its Galois group consists of transformations of the form $i \mapsto ai + b$. Any
transformation of such a form with two fixed points is the identity. This means
that, after adjoining two roots $\alpha_i$ and $\alpha_j$, the Galois group reduces to the
identity transformation, i.e., all the roots belong to $\mathbb{Q}(\alpha_i, \alpha_j)$.
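
The two properties of the transformations $i \mapsto ai + b \pmod p$ used above (the multiplication law from part a) and the fixed-point property) can be verified by brute force for a small prime. This is a minimal sketch, assuming $p = 7$ for illustration; the helper names `apply`, `compose` and `fixed_points` are hypothetical.

```python
from itertools import product

p = 7  # assumed small prime, chosen only for illustration

def apply(t, i):
    """Apply the transformation [a, b] : i -> a*i + b (mod p)."""
    a, b = t
    return (a * i + b) % p

def compose(t1, t2):
    """[a1, b1][a2, b2] = [a1*a2, a1*b2 + b1]: apply t2 first, then t1."""
    (a1, b1), (a2, b2) = t1, t2
    return ((a1 * a2) % p, (a1 * b2 + b1) % p)

# All transformations with a != 0 (mod p): p*(p-1) elements.
group = [(a, b) for a in range(1, p) for b in range(p)]

# The stated multiplication law agrees with composition of maps.
law_holds = all(
    apply(compose(t1, t2), i) == apply(t1, apply(t2, i))
    for t1, t2 in product(group, repeat=2)
    for i in range(p)
)

def fixed_points(t):
    return [i for i in range(p) if apply(t, i) == i]

# A non-identity transformation fixes at most one point.
max_fixed = max(len(fixed_points(t)) for t in group if t != (1, 0))
```

For $a \not\equiv 1$ the congruence $ai + b \equiv i$ has exactly one solution, and for $a \equiv 1$, $b \not\equiv 0$ it has none, which is what `max_fixed` reflects.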

Now suppose that for any two distinct roots $\alpha_i$ and $\alpha_j$ the field $\mathbb{Q}(\alpha_i, \alpha_j)$
contains all the other roots. This means that if a transformation from the
Galois group fixes two roots, then it fixes all the other roots, i.e., any
non-identical transformation has no more than one fixed point.

The Galois group $G$ transitively acts on the set $\{\alpha_1, \dots, \alpha_p\}$ of $p$ elements,
and so $|G|$ is divisible by $p$.


Lemma. If the number of elements of the group $G$ is divisible by a prime
$p$, then $G$ contains an element of order $p$.

Proof. Let $|G| = n = mp$. We use induction on $m$. For $m = 1$, the
statement is obvious. If $G$ has a proper subgroup $H$ whose index $[G : H]$ is
not divisible by $p$, then $|H|$ is divisible by $p$ and we can apply the inductive
hypothesis. Therefore we may assume that the index of any proper subgroup
is divisible by $p$.

For any $x \in G$, consider the subgroup $N_x = \{g \in G \mid gxg^{-1} = x\}$ and
the class of conjugate elements $G_x = \{gxg^{-1} \mid g \in G\}$. Clearly, $|G_x| =
[G : N_x]$. The conjugacy classes either do not intersect or coincide, and so
$n = n_1 + \dots + n_s$, where $n_i$ is the number of elements in the $i$th conjugacy
class. By the hypothesis $n_i$ is either equal to 1 or divisible by $p$. If $n_i = 1$,
then the corresponding element $x$ commutes with all the elements of $G$, i.e.,
belongs to the center $Z(G)$. The number of the $n_i$ equal to 1 is divisible by
$p$ and distinct from zero (since $n_i = 1$ corresponds to the identity element).
Hence $Z(G)$ is an abelian group whose order is divisible by $p$. Therefore $Z(G)$
has an element of order $p$.


Any element $\sigma$ of order $p$ in $G \subset S_p$ is a cycle of length $p$. Renumbering
the roots, we may assume that $\sigma(i) = i + 1$. Let us show that $\sigma$ generates a
normal subgroup in $G$. Let $\tau \in G$. Then $\tau\sigma\tau^{-1} \ne \mathrm{id}$ and $(\tau\sigma\tau^{-1})^p = \mathrm{id}$, i.e.,
$\tau\sigma\tau^{-1}$ is a cycle of length $p$. Now define $a(i)$ by the formula
\[ \tau\sigma\tau^{-1}(i) = i + a(i) = \sigma^{a(i)}(i). \]
A cycle of length $p$ in the group $S_p$ cannot possess fixed points, and so $a(i) \ne 0$
for all $i$. Therefore the function $a(i)$ on the set of $p$ elements takes not more
than $p - 1$ distinct values, and hence it assumes a certain value $a$ at two
distinct points, say $i$ and $j$. This means that the transformation $\sigma^{-a}\tau\sigma\tau^{-1}$
has two fixed points. But we know already that any transformation from $G$
with two fixed points is the identity. Hence $\tau\sigma\tau^{-1} = \sigma^a$.

The equation $\tau\sigma\tau^{-1} = \sigma^a$ implies that $\tau(i) = ai + b$, where $b = \tau(1) - a$.
The group consisting of elements of this form is, as we know, solvable.


Corollary. If an equation of prime degree $p > 3$, irreducible over $\mathbb{Q}$, is
solvable by radicals, then the number of its real roots is equal to either 1 or $p$.


Proof. The number $p$ is odd, and so any equation of degree $p$ has at least
one real root. If an equation of prime degree $p$ which is solvable by radicals
has two real roots $\alpha_i$ and $\alpha_j$, then all its roots belong to $\mathbb{Q}(\alpha_i, \alpha_j) \subset \mathbb{R}$.


5.4 Calculation of the Galois groups 


5.4.1 The discriminant and the Galois group 


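Theorem 5.4.1. Let $f \in \mathbb{Z}[x]$ be an irreducible monic polynomial of degree $n$.
The Galois group of $f$ over $\mathbb{Q}$ is contained in the alternating subgroup $A_n \subset S_n$
if and only if the discriminant $D(f)$ is a perfect square.

Proof. Let $\alpha_1, \dots, \alpha_n$ be all the roots of $f$. Then $D(f) = \prod_{i<j} (\alpha_i - \alpha_j)^2$.
Set $\delta = \delta(f) = \prod_{i<j} (\alpha_i - \alpha_j)$. If $\sigma \in G_{\mathbb{Q}}(f)$, then
\[ \sigma(\delta) = \prod_{i<j} \bigl(\alpha_{\sigma(i)} - \alpha_{\sigma(j)}\bigr) = (-1)^{\sigma}\delta. \]
By the hypothesis $f$ is irreducible. Therefore, in particular, $\delta \ne 0$. Hence all
the automorphisms $\sigma \in G_{\mathbb{Q}}(f)$ preserve $\delta$ if and only if $G_{\mathbb{Q}}(f) \subset A_n$. On the
other hand, all the automorphisms $\sigma \in G_{\mathbb{Q}}(f)$ preserve $\delta$ if and only if $\delta \in \mathbb{Q}$.
Since $\delta$ is an algebraic integer and $\delta^2 = D(f)$, the condition $\delta \in \mathbb{Q}$ means
precisely that $D(f)$ is a perfect square.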


Example. Let $f(x) = x^3 + ax^2 + bx + c$ be an irreducible polynomial over
$\mathbb{Z}$ and $D$ its discriminant. Then $G_{\mathbb{Q}}(f) = A_3$ if $\sqrt{D} \in \mathbb{Q}$ and $G_{\mathbb{Q}}(f) = S_3$ if
$\sqrt{D} \notin \mathbb{Q}$.


Proof. Since $f$ is irreducible, its Galois group $G_{\mathbb{Q}}(f)$ is transitive. In $S_3$,
there is only one transitive group distinct from $S_3$, namely, $A_3$.
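
The cubic case can be turned into a short computation. The sketch below uses the standard closed formula for the discriminant of $x^3 + ax^2 + bx + c$ (the formula itself is not stated in the text), and it assumes without checking that the input polynomial is irreducible over $\mathbb{Q}$; the function names are hypothetical.

```python
from math import isqrt

def cubic_discriminant(a, b, c):
    """Discriminant of x^3 + a*x^2 + b*x + c (standard closed formula)."""
    return a * a * b * b - 4 * b ** 3 - 4 * a ** 3 * c - 27 * c * c + 18 * a * b * c

def is_perfect_square(d):
    return d >= 0 and isqrt(d) ** 2 == d

def cubic_galois_group(a, b, c):
    """Galois group over Q of an irreducible cubic with integer coefficients.

    Irreducibility is an assumption of this sketch and is NOT verified here.
    """
    return "A3" if is_perfect_square(cubic_discriminant(a, b, c)) else "S3"
```

For instance, $x^3 - 3x + 1$ has discriminant $81 = 9^2$, whereas $x^3 - 2$ has discriminant $-108$.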


5.4.2 Resolvent polynomials 
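Let $\varphi \in \mathbb{Q}[x_1, \dots, x_n]$ and
\[ G_\varphi = \{\sigma \in S_n \mid \sigma\varphi = \varphi, \text{ i.e., } \varphi(x_{\sigma(1)}, \dots, x_{\sigma(n)}) = \varphi(x_1, \dots, x_n)\}. \]
Under the action of $S_n$, we obtain from $\varphi$ distinct functions
$\varphi_1 = \varphi$, $\varphi_2 = \tau_2\varphi$, ..., $\varphi_m = \tau_m\varphi$, where $m = |S_n|/|G_\varphi|$.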


Example 5.4.2. If $\varphi = x_1x_2 + x_3x_4$, then $G_\varphi$ consists of the identity element
and the permutations
\[ (12),\ (34),\ (12)(34),\ (13)(24),\ (14)(23),\ (1324),\ (1423). \]
So in this case $\varphi_2 = x_1x_3 + x_2x_4$ and $\varphi_3 = x_1x_4 + x_2x_3$.
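
The stabilizer and the orbit in this example can be enumerated directly. In the following sketch the polynomial $x_1x_2 + x_3x_4$ is encoded as a set of monomials, each monomial being a set of variable indices; this encoding is an assumption that works only because every monomial here is squarefree with coefficient 1. The names `act`, `stabilizer` and `orbit` are hypothetical.

```python
from itertools import permutations

# phi = x1*x2 + x3*x4, encoded as a set of monomials (sets of indices).
phi = frozenset({frozenset({1, 2}), frozenset({3, 4})})

def act(sigma, poly):
    """Apply a permutation (dict i -> sigma(i)) to the variable indices."""
    return frozenset(frozenset(sigma[i] for i in mono) for mono in poly)

S4 = [dict(zip((1, 2, 3, 4), image)) for image in permutations((1, 2, 3, 4))]

stabilizer = [s for s in S4 if act(s, phi) == phi]   # the group G_phi
orbit = {act(s, phi) for s in S4}                    # phi_1, phi_2, phi_3
```

The stabilizer has 8 elements and the orbit has $24/8 = 3$ elements, in agreement with $m = |S_n|/|G_\varphi|$.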


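It is easy to verify that $G_{\varphi_i} = \tau_iG_\varphi\tau_i^{-1}$. Indeed, the equality $\sigma\varphi_i = \varphi_i$ is
equivalent to $\sigma\tau_i\varphi = \tau_i\varphi$. Therefore $\tau_i^{-1}\sigma\tau_i \in G_\varphi$, i.e., $\sigma \in \tau_iG_\varphi\tau_i^{-1}$.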

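Let $f(x) = x^n + a_{n-1}x^{n-1} + \dots + a_0$ be a polynomial with integer
coefficients and $\varphi \in \mathbb{Z}[x_1, \dots, x_n]$. Define the polynomials $\varphi_1, \dots, \varphi_m$ and the
respective groups $G_{\varphi_i}$ as above. The resolvent polynomial (of $\varphi$ and $f$) is the
polynomial
\[ \mathrm{Res}(\varphi, f)(x) = \prod_{i=1}^{m} \bigl(x - \varphi_i(\alpha_1, \dots, \alpha_n)\bigr), \]
where $\alpha_1, \dots, \alpha_n$ are all the roots of $f$.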


Since the resolvent polynomial $\mathrm{Res}(\varphi, f)$ has integer coefficients, one can
calculate it by approximately computing the roots of $f$ (the coefficients of the
resolvent polynomial are calculated with sufficient accuracy and rounded
to the nearest integer).

Theorem 5.4.3. Let the resolvent polynomial $\mathrm{Res}(\varphi, f)$ have no multiple
roots. Then the Galois group of $f$ over $\mathbb{Q}$ is contained in a group conjugate
to $G_\varphi$ if and only if $\mathrm{Res}(\varphi, f)$ has an integer root.


Proof. First, suppose that
\[ G_{\mathbb{Q}}(f) \subset \tau G_\varphi\tau^{-1} = G_{\varphi_i}, \]
where $\varphi_i = \tau\varphi$. Then the number $\varphi_i(\alpha_1, \dots, \alpha_n)$, which is a root of $\mathrm{Res}(\varphi, f)$,
is preserved under the action of all the transformations from the Galois
group, and therefore is a rational number. But the numbers $\alpha_1, \dots, \alpha_n$ are
algebraic integers and the coefficients of the polynomial $\varphi_i$ are integers, so
$\varphi_i(\alpha_1, \dots, \alpha_n)$ is an integer.

Now suppose that $\varphi_i(\alpha_1, \dots, \alpha_n)$ is an integer. By the hypothesis the
resolvent polynomial has no multiple roots. This means that, if
\[ \sigma\varphi_i(\alpha_1, \dots, \alpha_n) = \varphi_j(\alpha_1, \dots, \alpha_n), \]
then $\tau_j^{-1}\sigma\tau_i \in G_\varphi$. Any element $\sigma$ of the Galois group preserves the integer
$\varphi_i(\alpha_1, \dots, \alpha_n)$, and hence, for this $\sigma$, we have $i = j$, i.e., $\sigma \in \tau_iG_\varphi\tau_i^{-1}$.


Making use of Theorems 5.4.1 and 5.4.3 we can easily calculate the Galois
group of any irreducible polynomial of the form $f(x) = x^4 + a_1x^3 + a_2x^2 +
a_3x + a_4$ with integer coefficients. First of all, observe that, up to conjugation,
$S_4$ contains only the following transitive subgroups:

1) The whole group $S_4$.

2) The alternating group $A_4$.

3) The dihedral group of order 8 described in Example 5.4.2. (We denote
this group by $D_4$.)

4) Klein’s Viergruppe of order 4 consisting, apart from the identity, of the
permutations $(12)(34)$, $(13)(24)$ and $(14)(23)$. (We denote this group by $V_4$.)

5) The cyclic group $\mathbb{F}_4$ generated by the cycle $(1234)$.

There are the following embeddings: $V_4 \subset D_4 \cap A_4$ and $\mathbb{F}_4 \subset D_4$, but
$\mathbb{F}_4 \not\subset A_4$.

Let us start our calculation of the Galois group with the calculation of
the discriminant $D$ of $f$ and the resolvent polynomial $\mathrm{Res}(\varphi, f)(x)$ for $\varphi =
x_1x_2 + x_3x_4$. Easy calculations show that
\[ \mathrm{Res}(\varphi, f)(x) = x^3 - a_2x^2 + (a_1a_3 - 4a_4)x - a_4a_1^2 + 4a_2a_4 - a_3^2. \]
It is also easy to verify that $\mathrm{Res}(\varphi, f)(x)$ has no multiple roots. Let, e.g.,
\[ \alpha_1\alpha_2 + \alpha_3\alpha_4 = \alpha_1\alpha_3 + \alpha_2\alpha_4. \]
Then $(\alpha_1 - \alpha_4)(\alpha_2 - \alpha_3) = 0$. But $f$ has no multiple roots, and so $\alpha_1 \ne \alpha_4$
and $\alpha_2 \ne \alpha_3$.

If the resolvent polynomial has no integer roots, then the Galois group is
equal to either $S_4$ or $A_4$. One can distinguish these groups with the help of
$D$: if $D$ is a perfect square, then the Galois group is equal to $A_4$. Otherwise,
it is equal to $S_4$.

If the resolvent polynomial has an integer root, then the Galois group is
equal to $\mathbb{F}_4$, $V_4$ or $D_4$. Of these groups only $V_4$ is contained in $A_4$. So if $\sqrt{D}$ is
an integer, then the Galois group is equal to $V_4$; otherwise it is equal to either
$\mathbb{F}_4$ or $D_4$.
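
The decision procedure described so far can be sketched in a few lines. The code below is an illustration under stated assumptions: the input quartic $x^4 + a_1x^3 + a_2x^2 + a_3x + a_4$ is assumed irreducible over $\mathbb{Q}$ (not checked); the coefficients of the cubic resolvent for $x_1x_2 + x_3x_4$ are obtained from the elementary symmetric functions of the roots; and the discriminant of $f$ is computed as the discriminant of this cubic, using the fact that differences of resolvent roots are products such as $(\alpha_1 - \alpha_4)(\alpha_2 - \alpha_3)$. All function names are hypothetical, and the last case is deliberately left undecided.

```python
from math import isqrt

def cubic_discriminant(B, C, E):
    """Discriminant of x^3 + B*x^2 + C*x + E (standard closed formula)."""
    return B * B * C * C - 4 * C ** 3 - 4 * B ** 3 * E - 27 * E * E + 18 * B * C * E

def resolvent_cubic(a1, a2, a3, a4):
    """Coefficients (B, C, E) of the monic cubic with roots
    r1*r2 + r3*r4, r1*r3 + r2*r4, r1*r4 + r2*r3 (r_i the roots of f)."""
    return (-a2, a1 * a3 - 4 * a4, -(a1 * a1 * a4 - 4 * a2 * a4 + a3 * a3))

def integer_roots(B, C, E):
    """Integer roots of x^3 + B*x^2 + C*x + E, searched within the Cauchy bound."""
    bound = 1 + max(abs(B), abs(C), abs(E))
    return [t for t in range(-bound, bound + 1)
            if t ** 3 + B * t * t + C * t + E == 0]

def quartic_galois_group(a1, a2, a3, a4):
    """Classify the Galois group of an irreducible quartic (irreducibility assumed)."""
    B, C, E = resolvent_cubic(a1, a2, a3, a4)
    D = cubic_discriminant(B, C, E)          # equals the discriminant of f
    square = D >= 0 and isqrt(D) ** 2 == D
    if not integer_roots(B, C, E):
        return "A4" if square else "S4"
    return "V4" if square else "F4 or D4"
```

For example, $x^4 + 1$ yields the resolvent $x^3 - 4x$ with integer roots and $D = 256$, so the sketch reports $V_4$; for $x^4 - 2$ it reports the undecided case, which requires the further resolvent discussed next.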

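One can distinguish which is actually the case, that of $\mathbb{F}_4$ or that of $D_4$, with
the help of the resolvent polynomial corresponding to
\[ \varphi = x_1x_2^2 + x_2x_3^2 + x_3x_4^2 + x_4x_1^2, \]
for which $G_\varphi = \mathbb{F}_4$. But in this case the resolvent polynomial (of degree 6)
may have multiple roots. For example, the equality
\[ \alpha_1\alpha_2^2 + \alpha_2\alpha_3^2 + \alpha_3\alpha_4^2 + \alpha_4\alpha_1^2 = \alpha_1\alpha_2^2 + \alpha_2\alpha_4^2 + \alpha_4\alpha_3^2 + \alpha_3\alpha_1^2 \]
is equivalent (since $\alpha_3 \ne \alpha_4$) to the equality
\[ \alpha_1^2 - \alpha_2(\alpha_3 + \alpha_4) + \alpha_3\alpha_4 = 0. \qquad (1) \]
But if we replace each root $\alpha_i$ with $\alpha_i + a$, then, for some $a$, identity (1) will
be violated.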

In the general case one can get rid of the multiple roots of the resolvent
polynomial by applying a more complicated Tschirnhaus transformation. This
is a useful practical device (the calculations are simple, but it does not always
work), whereas theoretically (the calculations are difficult, but it always works)
one can use the following statement.


Theorem 5.4.4. Let $f \in \mathbb{Z}[x]$ be a polynomial of degree $n$ without multiple
roots and let $G \subset S_n$ be an arbitrary subgroup. Then there exists a function
$\varphi \in \mathbb{Z}[x_1, \dots, x_n]$ such that $G_\varphi = G$ and the resolvent polynomial $\mathrm{Res}(\varphi, f)$
has no multiple roots.


Proof. Let $\alpha_1, \dots, \alpha_n$ be all the roots of $f$. We have shown in the
construction of the Galois resolvent that there exist integers $m_1, \dots, m_n$ such
that the numbers $m_1\alpha_{\sigma(1)} + \dots + m_n\alpha_{\sigma(n)}$ are pairwise distinct as $\sigma$ ranges
over $S_n$. We define
\[ \psi(t, x_1, \dots, x_n) = \prod_{\sigma \in G} \bigl(t - m_1x_{\sigma(1)} - \dots - m_nx_{\sigma(n)}\bigr). \]
For any $\tau \in S_n$, consider the polynomial
\[ \tau\psi(t, x_1, \dots, x_n) = \psi(t, x_{\tau(1)}, \dots, x_{\tau(n)}). \]
The polynomials $\tau_1\psi$ and $\tau_2\psi$ are distinct if and only if they are distinct as
polynomials in $t$ for the fixed values $x_1 = \alpha_1, \dots, x_n = \alpha_n$.

Under the action of $S_n$, we construct from the function $\psi(t, x_1, \dots, x_n) =
\psi$ distinct functions $\psi_1 = \psi, \psi_2, \dots, \psi_m$, and the polynomials in $t$ of the
form $\psi_1(t, \alpha_1, \dots, \alpha_n), \dots, \psi_m(t, \alpha_1, \dots, \alpha_n)$ are also distinct. Therefore there
exists $t_0 \in \mathbb{Z}$ for which the numbers $\psi_1(t_0, \alpha_1, \dots, \alpha_n), \dots, \psi_m(t_0, \alpha_1, \dots, \alpha_n)$
are distinct. But then the polynomial
\[ \varphi(x_1, \dots, x_n) = \psi(t_0, x_1, \dots, x_n) = \prod_{\sigma \in G} \bigl(t_0 - m_1x_{\sigma(1)} - \dots - m_nx_{\sigma(n)}\bigr) \]
is the one to be constructed. Indeed, if $\tau \notin G$, the polynomials $\varphi(x_1, \dots, x_n)$
and $\tau\varphi(x_1, \dots, x_n)$ are distinct since they have distinct values at the point
$(x_1, \dots, x_n) = (\alpha_1, \dots, \alpha_n)$.


5.4.3 The Galois group modulo p 


Let $f(x) = x^n + a_1x^{n-1} + \dots + a_n$ be an irreducible polynomial with integer
coefficients. For any prime $p$, consider the polynomial $f \pmod p$ over $\mathbb{F}_p$ obtained by
replacing the coefficients of $f$ by the corresponding residue classes modulo $p$.
The polynomial f (mod p) can turn out to be reducible and the structure of 
its factorization into irreducible factors is closely connected with the structure 
of the Galois group of f over Q. We now study this connection and apply it 
to the calculation of certain Galois groups. 

We only consider the primes $p$ for which the polynomial $f \pmod p$ has no
multiple roots, i.e., for which the polynomials $f \pmod p$ and $f' \pmod p$ are
relatively prime. The latter condition is equivalent to the fact that $p$ does not
divide $R(f, f') = \pm D(f)$.

In what follows we assume that D(f) is not divisible by p. 

The roots of the polynomial $f \pmod p$ do not necessarily belong to $\mathbb{F}_p$, but
there exists a finite extension of $\mathbb{F}_p$ which contains all the roots of $f \pmod p$.
In order to construct such an extension, it suffices to show how to adjoin to
an arbitrary finite field $\mathbb{F}_q$ a root of a polynomial $h$ irreducible over $\mathbb{F}_q$. It is
easy to verify that the quotient ring
\[ K = \mathbb{F}_q[x]/h(x)\mathbb{F}_q[x] \]
is a field. Indeed, if the polynomial $g(x) \in \mathbb{F}_q[x]$ is not divisible by $h(x)$, then
it is relatively prime to $h(x)$, and so $u(x)g(x) + v(x)h(x) = 1$ for some $u(x)$
and $v(x) \in \mathbb{F}_q[x]$. Hence $u = g^{-1} \pmod{h(x)}$. Clearly $K = \mathbb{F}_q(\alpha)$, where $\alpha$ is
the image of $x$ under the canonical projection. Clearly $h(\alpha) = 0$, i.e., $\alpha$ is a
root of $h$.

Having adjoined to $\mathbb{F}_p$ all the roots $\alpha_1, \dots, \alpha_n$ of the polynomial $f \pmod p$
we obtain a field $\mathbb{F}_p(\alpha_1, \dots, \alpha_n) = \mathbb{F}_{p^r}$. Let $\mathrm{Gal}$ be the Galois group of $f$ over
$\mathbb{Q}$, and $\mathrm{Gal}_p$ the group of automorphisms of $\mathbb{F}_p(\alpha_1, \dots, \alpha_n)$ over $\mathbb{F}_p$. Any such
automorphism sends a root $\alpha_i$ into a root $\alpha_j$, and any permutation of roots
uniquely determines an automorphism. Therefore, having numbered the roots,
we may assume that $\mathrm{Gal}_p$ is a subgroup of $S_n$. Observe that the roots of the
polynomials $f$ and $f \pmod p$ belong to distinct sets, and between the roots
of these polynomials there is no natural one-to-one correspondence.

Theorem 5.4.5. a) Under an appropriate numbering of the roots of $f$ and
$f \pmod p$ the group $\mathrm{Gal}_p$ becomes a subgroup of $\mathrm{Gal}$.

b) $\mathrm{Gal}_p$ is a cyclic group of order $r = [\mathbb{F}_p(\alpha_1, \dots, \alpha_n) : \mathbb{F}_p]$.

c) If $f \pmod p$ is the product of several irreducible factors whose degrees
are $n_1, \dots, n_k$, then $\mathrm{Gal}_p$ contains the product of (non-intersecting) cycles
whose lengths are $n_1, \dots, n_k$.


Proof. a) One can construct a polynomial similar to the Galois resolvent
by replacing the integers $m_1, \dots, m_n$ by the indeterminates $u_1, \dots, u_n$. Namely,
let $\beta_1, \dots, \beta_n$ be all the roots of $f$. Consider the polynomial
\[ F(x, u_1, \dots, u_n) = \prod_{\sigma \in S_n} \bigl(x - u_1\beta_{\sigma(1)} - \dots - u_n\beta_{\sigma(n)}\bigr) \]
and let $G(x, u_1, \dots, u_n)$ be its divisor, irreducible over $\mathbb{Z}$, which is divisible by
$x - u_1\beta_1 - \dots - u_n\beta_n$.

In the same way as for the Galois resolvent, we prove that the Galois group
$\mathrm{Gal}$ consists of the elements which correspond to the linear divisors of $G$.

Over $\mathbb{F}_p$, the polynomial $G \pmod p$ factorizes into irreducible factors
$G_1 \pmod p, \dots, G_s \pmod p$. Any permutation of the roots $\alpha_1, \dots, \alpha_n$ of the
polynomial $f \pmod p$ which belongs to $\mathrm{Gal}_p$ sends $G_1 \pmod p$ into the
same polynomial $G_1 \pmod p$. Therefore the same permutation of the roots
$\beta_1, \dots, \beta_n$ cannot send $G(x, u_1, \dots, u_n)$ into another irreducible factor of
$F(x, u_1, \dots, u_n)$ since $F \pmod p$ has no multiple divisors.

b) In a field of characteristic $p$, we have $(x + y)^p = x^p + y^p$. Therefore,
if $x \ne y$, then $x^p - y^p = (x - y)^p \ne 0$. Hence the map $x \mapsto x^p$ is an
automorphism of the field $\mathbb{F}_{p^r}$ which preserves the elements of $\mathbb{F}_p$. The powers of this
automorphism send $x$ to $x^p, x^{p^2}, \dots, x^{p^r} = x$. The field $\mathbb{F}_{p^r}$ is obtained from
the field $\mathbb{F}_p$ by adjoining a root $\zeta$ of the polynomial $x^{q-1} - 1$, where $q = p^r$.
Here $\zeta^a \ne \zeta^b$ if $0 < a < b < q - 1$. Therefore the automorphisms
\[ \zeta \mapsto \zeta^p, \quad \zeta \mapsto \zeta^{p^2}, \quad \dots, \quad \zeta \mapsto \zeta^{p^r} = \zeta \]
are distinct. On the other hand, the degree of the extension $\mathbb{F}_{p^r} \supset \mathbb{F}_p$ is equal
to $r$, and hence $\zeta$ satisfies an algebraic equation of degree $r$ over $\mathbb{F}_p$. Any
automorphism of $\mathbb{F}_{p^r}$ over $\mathbb{F}_p$ is uniquely determined by the image of $\zeta$, and
therefore the number of distinct automorphisms does not exceed $r$.

c) The group $\mathrm{Gal}_p$ is cyclic, and so it is generated by an element $\sigma$. Let us
represent this element as a product of non-intersecting cycles:
\[ \sigma = (1\ 2\ \dots\ n_1)(n_1 + 1\ \dots\ n_1 + n_2) \cdots (\dots\ n). \]
The group $\mathrm{Gal}_p$ transitively acts on the elements of each cycle, and so the
cycles correspond to the irreducible factors of $f \pmod p$.

Example. The Galois group of the polynomial $x^5 - x - 1$ over $\mathbb{Q}$ is equal
to $S_5$.


Proof. Modulo 2, the polynomial considered factorizes into irreducible
factors $x^2 + x + 1$ and $x^3 + x^2 + 1$, and therefore its Galois group
contains an element of the form $(ij)(klm)$, where $\{i, j, k, l, m\}$ is a permutation of
$1, 2, 3, 4, 5$.

Modulo 3, the polynomial $x^5 - x - 1$ is irreducible. Indeed, if this polynomial
were reducible, it would have had a factor of degree 1 or 2. The product of all
irreducible polynomials of degree 1 or 2 over $\mathbb{F}_3$ is equal to $x^9 - x$ (see Theorem
3.3.7 on page 99). Since $x^9 - x = (x^5 - x)(x^4 + 1)$, the polynomial $x^5 - x - 1$ should
have a common divisor either with the polynomial $x^5 - x$ or with the polynomial
$x^4 + 1$. Neither is possible.
Therefore the Galois group contains the cycle $(12345)$.

The Galois group also contains the element $((ij)(klm))^3 = (ij)$. Considering
conjugations of $(ij)$ by $(12345)^a$, we obtain the transpositions $(i + a, j + a)$.
For $a = j - i$, we consecutively obtain transpositions $(ij)$, $(jk)$, $(kl)$, $(lm)$,
$(mi)$ which generate the whole group $S_5$.
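
The factorization patterns used in this example can be computed without factoring explicitly. The sketch below relies on a standard fact not proved in the text: when $f \pmod p$ has no multiple roots, $\gcd(x^{p^d} - x, f)$ is the product of the irreducible factors of $f \pmod p$ whose degrees divide $d$. Polynomials are stored as coefficient lists, lowest degree first; all function names are hypothetical, and `pow(c, -1, p)` assumes Python 3.8 or later.

```python
def trim(a):
    """Remove trailing zero coefficients in place."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mod(a, b, p):
    """Remainder of a modulo b over F_p (b reduced mod p, nonzero)."""
    a = trim([c % p for c in a])
    inv = pow(b[-1], -1, p)              # inverse of the leading coefficient
    while len(a) >= len(b):
        coef, shift = a[-1] * inv % p, len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - coef * c) % p
        trim(a)
    return a

def poly_mulmod(a, b, f, p):
    prod = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_mod(prod, f, p)

def poly_powmod(base, e, f, p):
    result, base = [1], poly_mod(base, f, p)
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f, p)
        base = poly_mulmod(base, base, f, p)
        e >>= 1
    return result

def poly_sub(a, b, p):
    m = max(len(a), len(b))
    a, b = a + [0] * (m - len(a)), b + [0] * (m - len(b))
    return trim([(u - v) % p for u, v in zip(a, b)])

def poly_gcd(a, b, p):
    while b:
        a, b = b, poly_mod(a, b, p)
    return a

def factor_degrees(f, p):
    """Degrees of the irreducible factors of f mod p (f mod p squarefree)."""
    f = trim([c % p for c in f])
    x, h, counts = [0, 1], [0, 1], {}
    for d in range(1, len(f)):
        h = poly_powmod(h, p, f, p)                      # h = x^(p^d) mod f
        total = len(poly_gcd(f, poly_sub(h, x, p), p)) - 1
        smaller = sum(e * c for e, c in counts.items() if d % e == 0)
        if total > smaller:
            counts[d] = (total - smaller) // d
    return sorted(e for e, c in counts.items() for _ in range(c))

f = [-1, -1, 0, 0, 0, 1]   # x^5 - x - 1, lowest degree first
```

With this encoding, `factor_degrees(f, 2)` and `factor_degrees(f, 3)` give the degree patterns $2, 3$ and $5$ used in the proof, which by Theorem 5.4.5 c) correspond to the cycle types $(ij)(klm)$ and $(12345)$.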


Frobenius proved that, if the Galois group of an irreducible polynomial $f$
of degree $n$ contains an element representable as the product of cycles whose
lengths are equal to $n_1, \dots, n_k$, then there exist infinitely many primes $p$ for
which the polynomial $f \pmod p$ factorizes into irreducible factors of degrees
$n_1, \dots, n_k$. He even calculated the density of such primes $p$. For Frobenius’s
density theorem and its generalization, Chebotaryov’s density theorem,
see [J], [Ch], [Al] and [Sel].


6 


Ideals in Polynomial Rings 


6.1 Hilbert’s basis theorem and Hilbert’s theorem on
zeros


6.1.1 Hilbert’s basis theorem 


Hilbert’s basis theorem appeared in his famous paper [Hi2]. In this work he 
suggested totally new methods, using which he managed to prove the existence 
of a finite basis for the invariants of forms. Previously, in 1868, Gordan proved 
the existence of a finite basis only for binary forms, and this was performed by 
a very labour-consuming case-checking. Hilbert, on the contrary, managed to 
solve a number of central problems of invariant theory. His methods, however, 
were not constructive and this prompted Gordan to complain: “This is not 
mathematics. This is theology!” 

Let $K$ be a field (for example, $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$) or the ring $\mathbb{Z}$. Let $K[x_1, \dots, x_n]$
be a polynomial ring in $n$ indeterminates with coefficients in $K$.


Theorem 6.1.1 (Hilbert). Let $M \subset K[x_1, \dots, x_n]$ be an arbitrary subset.
Then there exists a finite set of polynomials $m_1, \dots, m_r \in M$ such that any
polynomial $m \in M$ can be represented in the form $m = \lambda_1m_1 + \dots + \lambda_rm_r$,
where $\lambda_i \in K[x_1, \dots, x_n]$.


It is convenient to formulate Hilbert’s theorem in terms of ideals. Then it 
will be easier to prove. 

A subset $I \subset K[x_1, \dots, x_n]$ is called an ideal if the following two conditions
hold:

1) $a, b \in I \implies a + b \in I$;

2) $a \in I$, $f \in K[x_1, \dots, x_n] \implies fa \in I$.

For any set $M \subset K[x_1, \dots, x_n]$, we can consider the ideal $I(M)$ generated
by $M$, which consists of all sums of the form $\lambda_1 m_1 + \dots + \lambda_t m_t$, where
$\lambda_i \in K[x_1, \dots, x_n]$ and $m_i \in M$.

A collection $\{a_\alpha \mid a_\alpha \in I\}$ is called a basis of the ideal $I$ if any element
$a \in I$ can be represented in the form $a = \lambda_1 a_{\alpha_1} + \dots + \lambda_t a_{\alpha_t}$, where
$\lambda_i \in K[x_1, \dots, x_n]$. The ideal $I$ is said to be finitely generated if it possesses a finite
basis.

To prove Theorem 6.1.1, it suffices to prove that the ideal $I(M)$ is finitely
generated. Indeed, in this case any element of $M \subset I(M)$ can be expressed in
terms of a finite collection of elements $a_1, \dots, a_t \in I(M)$, and each of these
elements by definition can be expressed in terms of a finite collection of elements
of $M$.


Theorem 6.1.2 (Hilbert’s basis theorem). Any ideal of $K[x_1, \dots, x_n]$ is
finitely generated.


Proof. First, we observe that any ideal in the considered rings K is finitely 
generated. Indeed, if K is a field, then any nonzero ideal coincides with K and 
is generated by 1. If K = Z, then any ideal is of the form mZ and is generated 
by m (to prove this, consider the smallest positive element of the ideal). 
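
The case $K = \mathbb{Z}$ can be illustrated numerically. The sketch below computes the generator $m$ of the ideal generated by a few integers and writes it as an integer combination of two of them. The names `ideal_generator` and `bezout` are chosen here for illustration only, and the extended Euclidean algorithm is used in place of the "smallest positive element" argument of the text.

```python
from functools import reduce
from math import gcd


def ideal_generator(generators):
    """Return the non-negative m such that the ideal (generators) equals mZ."""
    return reduce(gcd, generators, 0)


def bezout(a, b):
    """Extended Euclid: return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


m = ideal_generator([12, 18, 30])   # the ideal (12, 18, 30) is 6Z
g, u, v = bezout(12, 18)            # g is an explicit combination of 12 and 18
```

Every generator is a multiple of `m`, and `m` itself lies in the ideal because it is an integer combination of the generators.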

Let $L_n = K[x_1, \dots, x_n]$ for $n \ge 1$ and $L_0 = K$. Then $K[x_1, \dots, x_{n+1}] = L_n[x]$,
where $x = x_{n+1}$. As we have observed, for $n = 0$ any ideal of $L_n$ is
finitely generated. Therefore it suffices to prove that, if any ideal of $L = L_n$ is
finitely generated, then any ideal $I$ of the ring $L[x]$ is also finitely generated.


Step 1. The leading coefficients of the polynomials which belong to the
ideal $I \subset L[x]$, together with zero, constitute an ideal $J$ in $L$.


Indeed, let $f(x) = ax^n + \cdots$ and $g(x) = bx^m + \cdots$ be polynomials in $I$.
We may assume that $m \le n$. Then the polynomial $f(x) + x^{n-m} g(x)$ belongs
to $I$ and its leading coefficient is equal to $a + b$. It is also clear that if $\lambda \in L$
and $\lambda \ne 0$, then the leading coefficient of $\lambda f$ is equal to $\lambda a$.

Let us begin the construction of a finite basis of the ideal $I$ in $L[x]$ by
selecting a finite basis $a_1, \dots, a_r$ of the ideal $J$ in $L$. The elements $a_1, \dots, a_r$
are the leading coefficients of some polynomials $f_1, \dots, f_r \in I$.


Step 2. There exists a positive integer $n$ such that any polynomial in $I$
is the sum of a polynomial whose degree is less than $n$ and a polynomial of
the form $\lambda_1 f_1 + \dots + \lambda_r f_r$, where $\lambda_i \in L[x]$.


Let us prove that we can take $n$ to be the greatest of the degrees of the
polynomials $f_1, \dots, f_r$. In $I$, take an arbitrary polynomial $f(x) = ax^N + \cdots$ of
degree $N \ge n$. By definition, $a \in J$, and hence $a = \sum \lambda_i a_i$ for some $\lambda_i \in L$.
Consider the polynomial

$$g(x) = f(x) - \sum \lambda_i x^{N - \deg f_i} f_i.$$

The coefficient of $x^N$ in $g$ is $a - \sum \lambda_i a_i = 0$, and hence $\deg g \le N - 1$. If
$N - 1 \ge n$, we can repeat this construction, and so on.


Step 3. There exists a finite set of polynomials $g_1, \dots, g_s \in I$ such that
any polynomial in $I$ of degree less than $n$ can be represented in the form
$\lambda_1 g_1 + \dots + \lambda_s g_s$, where $\lambda_i \in L[x]$.

The coefficients of $x^{n-1}$ in the polynomials in $I$ whose degrees do not exceed
$n - 1$ constitute an ideal of the ring $L$. Let $b_1, \dots, b_k$ be a basis of this ideal and
$g_1, \dots, g_k$ the polynomials of degree $n - 1$ in $I$ whose leading coefficients are
$b_1, \dots, b_k$, respectively. In $I$, take an arbitrary polynomial $h$ of degree $n - 1$.
Let $b$ be the leading coefficient of this polynomial. Then $b = \lambda_1 b_1 + \dots + \lambda_k b_k$,
where $\lambda_i \in L$. Therefore the degree of $h - \lambda_1 g_1 - \dots - \lambda_k g_k$ does not exceed
$n - 2$. Thus, up to elements of the ideal generated by $g_1, \dots, g_k$, we have
replaced the polynomial of degree $n - 1$ by a polynomial of degree no higher
than $n - 2$. We can similarly select polynomials $g_{k+1}, \dots, g_l$ so that, up to
elements of the ideal generated by them, any polynomial of degree $n - 2$ is
equal to a polynomial of degree no higher than $n - 3$, and so on.


6.1.2 Hilbert’s theorem on zeros 


Hilbert’s theorem on zeros appeared in another famous paper by Hilbert on 
invariant theory [Hi4]. This theorem is sometimes called Hilbert’s theorem on 
roots. Its German name, Nullstellensatz, is also widely used in the English 
mathematical literature. 


Theorem 6.1.3 (Hilbert’s Nullstellensatz). Let

$$f, f_1, \dots, f_r \in \mathbb{C}[x_1, \dots, x_n],$$

where $f$ vanishes at all the common zeros of the polynomials $f_1, \dots, f_r$. Then,
for a certain positive integer $q$, the polynomial $f^q$ belongs to the ideal generated
by $f_1, \dots, f_r$, i.e., $f^q = g_1 f_1 + \dots + g_r f_r$ for some $g_1, \dots, g_r \in \mathbb{C}[x_1, \dots, x_n]$.


Proof. We first prove one particular case of Hilbert’s Nullstellensatz from 
which we can deduce the general case. Namely, consider the case when f = 1. 


Theorem 6.1.4. Let the polynomials $f_1, \dots, f_r \in \mathbb{C}[x_1, \dots, x_n]$ have no common
zeros. Then there exist polynomials $g_1, \dots, g_r \in \mathbb{C}[x_1, \dots, x_n]$ such that

$$g_1 f_1 + \dots + g_r f_r = 1.$$


Proof. Let $I(f_1, \dots, f_r)$ be the ideal in $K = \mathbb{C}[x_1, \dots, x_n]$ generated by
$f_1, \dots, f_r$. Suppose that there are no polynomials $g_1, \dots, g_r$ such that
$g_1 f_1 + \dots + g_r f_r = 1$. Then $I(f_1, \dots, f_r) \ne K$.


Step 1. Let $I$ be a nontrivial maximal ideal of $K$, let $I \supset I(f_1, \dots, f_r)$.
Then the ring $A = K/I$ is a field.


It suffices to verify that any nonzero element in $K/I$ has an inverse. If
$f \notin I$, then $I + fK$ is an ideal strictly containing $I$, and so $I + fK = K$ by
the maximality of $I$. This means, in particular, that there exist polynomials
$a \in I$ and $b \in K$ such that $a + bf = 1$. Then the class $\bar{b} \in K/I$ is the
inverse of $\bar{f} \in K/I$.

Let $\alpha_i$ be the image of $x_i$ under the canonical projection

$$p : \mathbb{C}[x_1, \dots, x_n] \to \mathbb{C}[x_1, \dots, x_n]/I = A.$$

Then $A = \mathbb{C}[\alpha_1, \dots, \alpha_n]$. Therefore $A$ is a finitely generated algebra over $\mathbb{C}$
which at the same time is a field.


Step 2. If a finitely generated algebra $A = \mathbb{C}[\alpha_1, \dots, \alpha_n]$ over $\mathbb{C}$ is a
field, then $A$ coincides with $\mathbb{C}$.


We need the following auxiliary statement. 


Lemma. In $A = \mathbb{C}[\alpha_1, \dots, \alpha_n]$, there exist elements $y_1, \dots, y_k$, algebraically
independent over $\mathbb{C}$, such that any element $a \in A$ satisfies a normed (monic)
algebraic equation over $\mathbb{C}[y_1, \dots, y_k]$, i.e.,

$$a^l + b_1 a^{l-1} + \dots + b_l = 0, \quad \text{where } b_1, \dots, b_l \in \mathbb{C}[y_1, \dots, y_k].$$

Proof. We use induction on $n$. If the elements $\alpha_1, \dots, \alpha_n$ are algebraically
independent, the statement is obvious. Let $f(\alpha_1, \dots, \alpha_n) = 0$ be an algebraic
relation between them. If $f$ is a polynomial of degree $m$ whose coefficient of
$x_n^m$ does not vanish, then

$$\alpha_n^m + b_1 \alpha_n^{m-1} + \dots + b_m = 0, \quad b_1, \dots, b_m \in \mathbb{C}[\alpha_1, \dots, \alpha_{n-1}].$$

It remains to use the induction hypothesis.

If the coefficient of $x_n^m$ is zero, we make the change of variables

$$x_n = \xi_n \quad \text{and} \quad x_i = \xi_i + a_i \xi_n \quad \text{for } i = 1, \dots, n-1.$$

Let us try to select the numbers $a_i \in \mathbb{C}$ so that the polynomial

$$g(\xi_1, \dots, \xi_{n-1}, \xi_n) = f(x_1, \dots, x_n) = f(\xi_1 + a_1 \xi_n, \dots, \xi_{n-1} + a_{n-1} \xi_n, \xi_n)$$

has a nonzero coefficient of $\xi_n^m$. This coefficient is equal to

$$g_m(0, \dots, 0, 1) = f_m(a_1, \dots, a_{n-1}, 1),$$

where $f_m$ and $g_m$ are the homogeneous components of the highest degree of $f$ and
$g$, respectively. Clearly, the nonzero homogeneous polynomial $f_m(x_1, \dots, x_n)$
cannot be identically zero for $x_n = 1$.


Now we can concentrate on the proof that $A$ coincides with $\mathbb{C}$. Select the
elements $y_1, \dots, y_k \in A$ as in Noether’s normalization lemma. Let us show
that any nonzero element $x \in B = \mathbb{C}[y_1, \dots, y_k]$ is invertible in $B$, i.e., $B$ is a field.
By assumption $A$ is a field, and so $x$ is invertible in $A$. Moreover, by Noether’s
lemma the element $x^{-1}$ satisfies the equation

$$(x^{-1})^l + b_1 (x^{-1})^{l-1} + \dots + b_l = 0, \quad b_1, \dots, b_l \in B.$$

Multiplying both sides of this equation by $x^{l-1}$ we obtain

$$x^{-1} = -b_1 - b_2 x - \dots - b_l x^{l-1} \in B.$$

The field $B = \mathbb{C}[y_1, \dots, y_k]$ is a ring of polynomials in $k$ indeterminates
over $\mathbb{C}$, but for $k \ne 0$ the polynomial ring cannot be a field. Hence $B = \mathbb{C}$.
Any element $a$ of the field $A$ is a root of a polynomial

$$a^l + b_1 a^{l-1} + \dots + b_l, \quad b_1, \dots, b_l \in B = \mathbb{C}.$$

Since $\mathbb{C}$ is algebraically closed, it follows that $A = \mathbb{C}$.

Step 3. The polynomials $f_1, \dots, f_r$ vanish at the point
$(\alpha_1, \dots, \alpha_n) \in \mathbb{C}^n$.

Indeed, under the canonical projection

$$p : \mathbb{C}[x_1, \dots, x_n] \to \mathbb{C}[x_1, \dots, x_n]/I = A = \mathbb{C}$$

the element $x_i$ transforms into $\alpha_i \in \mathbb{C}$, and so the polynomial $\varphi(x_1, \dots, x_n)$
transforms into $\varphi(\alpha_1, \dots, \alpha_n)$. Since the polynomials $f_1, \dots, f_r$ belong to the
ideal $I$, the canonical projection annihilates them.

Thus, having assumed that $I(f_1, \dots, f_r) \ne \mathbb{C}[x_1, \dots, x_n]$, we deduce that
$f_1, \dots, f_r$ have a common zero. This contradicts the assumption of the
theorem.


Following [Ra1] let us show now how to deduce the general Hilbert’s
Nullstellensatz from Theorem 6.1.4. For $f = 0$, the statement is obvious. We
therefore assume that $f \ne 0$. Let us add to the indeterminates $x_1, \dots, x_n$ a
new indeterminate $x_{n+1} = z$ and consider the polynomials $f_1, \dots, f_r, 1 - zf$.
They have no common zeros, and so

$$1 = h_1 f_1 + \dots + h_r f_r + h(1 - zf),$$

where $h_1, \dots, h_r, h$ are some polynomials in $x_1, \dots, x_n, z$. Set $z = \frac{1}{f}$. After
reducing to a common denominator we obtain

$$f^q = g_1 f_1 + \dots + g_r f_r,$$

where $g_1, \dots, g_r$ are some polynomials in $x_1, \dots, x_n$. This is a relation of the
form required.
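
The deduction can be checked on a tiny example, under the assumption $n = 1$, $f = x$ and $f_1 = x^2$ (so $f$ vanishes at the only zero of $f_1$). The multipliers $h_1 = z^2$ and $h = 1 + zx$ used below were found by hand for this example and are not taken from the text; polynomials in $x, z$ are stored as dictionaries mapping exponent pairs to integer coefficients, a representation chosen only for this sketch.

```python
from collections import defaultdict


def add(p, q):
    """Sum of two polynomials stored as {(deg_x, deg_z): coefficient}."""
    out = defaultdict(int)
    for poly in (p, q):
        for mono, c in poly.items():
            out[mono] += c
    return {mono: c for mono, c in out.items() if c != 0}


def mul(p, q):
    """Product of two polynomials in the same representation."""
    out = defaultdict(int)
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            out[(a1 + a2, b1 + b2)] += c1 * c2
    return {mono: c for mono, c in out.items() if c != 0}


one = {(0, 0): 1}
x = {(1, 0): 1}
z = {(0, 1): 1}

f = x                                   # f vanishes at the zero x = 0 of f1
f1 = mul(x, x)                          # f1 = x^2
h1 = mul(z, z)                          # hand-picked multiplier h1 = z^2
h = add(one, mul(z, x))                 # hand-picked multiplier h = 1 + z*x
one_minus_zf = add(one, {(1, 1): -1})   # 1 - z*f

# The identity 1 = h1*f1 + h*(1 - z*f) in C[x, z].
identity = add(mul(h1, f1), mul(h, one_minus_zf))

# Setting z = 1/f and clearing denominators gives f^2 = 1 * f1, i.e. q = 2.
f_squared = mul(f, f)
```

Here the substitution $z = 1/f$ kills the last summand, and multiplying by $f^2$ leaves $f^2 = 1 \cdot f_1$, a relation of the required form with $q = 2$ and $g_1 = 1$.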


Remark. If the coefficients of the polynomials $f, f_1, \dots, f_r$ are real and
$f$ vanishes at all the common complex roots of $f_1, \dots, f_r$, then there exist
polynomials $g_1, \dots, g_r$ with real coefficients such that $f^q = g_1 f_1 + \dots + g_r f_r$.

Indeed, by Hilbert’s Nullstellensatz the equality $f^q = h_1 f_1 + \dots + h_r f_r$
holds for some polynomials $h_1, \dots, h_r$ with complex coefficients. Let
$h_j = g_j + i p_j$, where $g_j$ and $p_j$ are polynomials with real coefficients. But then,
comparing the real parts of both sides, we obtain $f^q = g_1 f_1 + \dots + g_r f_r$.

Any set of homogeneous polynomials has a trivial common zero, namely the
origin. Therefore, for homogeneous polynomials, an analogue of the set of
polynomials without common zeros is the set of polynomials without common
nontrivial zeros.

For homogeneous polynomials the following analogue of Theorem 6.1.4 
holds. 


Theorem 6.1.5. Let $F_1, \dots, F_r \in \mathbb{C}[x_1, \dots, x_n]$ be homogeneous polynomials
without common nontrivial zeros. Then the ideal $I(F_1, \dots, F_r)$ generated by
them contains all the homogeneous polynomials of degree $d \ge d_0$, where $d_0$ is
a fixed number.


Proof. By assumption the only common zero of the polynomials is the
origin. Therefore the linear polynomials $x_1, \dots, x_n$ vanish at all the common
zeros of the polynomials $F_1, \dots, F_r$. By Hilbert’s Nullstellensatz $x_i^{p_i} \in
I(F_1, \dots, F_r)$ for some $p_i$. Set $d_0 = (p_1 - 1) + \dots + (p_n - 1) + 1$. Then any
monomial $x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$ of degree $d = \alpha_1 + \dots + \alpha_n \ge d_0$ is divisible by
$x_i^{p_i}$ for some $i$. Therefore $x^\alpha \in I(F_1, \dots, F_r)$.
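
The counting step of this proof can be verified by brute force. The sketch below assumes the illustrative exponents $p = (2, 3, 2)$ (not taken from the text), computes $d_0$, and checks that every monomial of degree at least $d_0$ is divisible by some $x_i^{p_i}$, while the bound cannot be lowered. The helper names are hypothetical.

```python
from itertools import product


def exponent_vectors(n, d):
    """All exponent vectors (a_1, ..., a_n) of non-negative integers with sum d."""
    return [a for a in product(range(d + 1), repeat=n) if sum(a) == d]


def threshold(p):
    """d_0 = (p_1 - 1) + ... + (p_n - 1) + 1, as in the proof."""
    return sum(pi - 1 for pi in p) + 1


def divisible_by_some_power(a, p):
    """True if x^a is divisible by x_i^{p_i} for at least one index i."""
    return any(ai >= pi for ai, pi in zip(a, p))


p = (2, 3, 2)            # assumed exponents with x_i^{p_i} in the ideal
d0 = threshold(p)

# Every monomial of degree d0, d0 + 1, d0 + 2 is divisible by some x_i^{p_i}.
all_divisible = all(
    divisible_by_some_power(a, p)
    for d in range(d0, d0 + 3)
    for a in exponent_vectors(len(p), d)
)

# The monomial with exponents p_i - 1 has degree d0 - 1 and is not divisible.
witness = tuple(pi - 1 for pi in p)
```

The witness monomial shows that $d_0$ is the smallest degree for which this pigeonhole argument works.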


A simple direct proof of Theorem 6.1.5 is given in the paper by Cartier 
and Tate [Ca3]. 


6.1.3 Hilbert’s polynomial 


Let us first recall certain definitions from commutative algebra. A module
over a ring $A$ is an Abelian group $M$ on which $A$ acts, i.e., for any $a \in A$ and
$m \in M$ there is determined an element $am \in M$ such that, for any $a, b \in A$
and $m, n \in M$, we have

$$a(m + n) = am + an, \quad (ab)m = a(bm),$$

$$(a + b)m = am + bm, \quad 1m = m.$$


For example, if A is a field, then a module over A is just a vector space over 
A. 

A module $M$ over a ring $A$ is said to be finitely generated if any element
$m \in M$ can be represented in the form $m = \sum a_i m_i$, where $a_i \in A$ and
$m_1, \dots, m_n$ is a fixed finite set of elements of $M$.

A ring $A$ is said to be graded if it is of the form $A = \bigoplus_{i=0}^{\infty} A_i$, where the $A_i$
are additive subgroups of $A$ and $A_i A_j \subset A_{i+j}$; here $A_i A_j$ consists
of sums of elements $a_i a_j$ such that $a_i \in A_i$ and $a_j \in A_j$. A graded module over a
graded ring $A$ is a module $M = \bigoplus_{i=0}^{\infty} M_i$, where the $M_i$ are additive subgroups
of $M$ such that $A_i M_j \subset M_{i+j}$; here $A_i M_j$ consists of sums of elements
$a_i m_j$ such that $a_i \in A_i$ and $m_j \in M_j$.

In this section the main example of a graded ring is $A = K[x_0, \dots, x_n]$,
where $K$ is a field; the homogeneous component $A_i$ consists of all homogeneous
polynomials of degree $i$.

An ideal $I \subset K[x_0, \dots, x_n]$ is said to be homogeneous if all the
homogeneous components of any element of $I$ also belong to $I$ (the homogeneous
component of degree $i$ of a polynomial $f \in K[x_0, \dots, x_n]$ is the sum of all its
terms of degree $i$). It is easy to verify that $I$ is homogeneous if and only if it
is generated by homogeneous polynomials $f_1, \dots, f_k$. Indeed, if $I$ is
homogeneous, then the homogeneous components of the polynomials that generate $I$
lie in $I$ and generate it. Conversely, if the ideal $I$ is generated by homogeneous
polynomials $f_1, \dots, f_k$, then any element $g \in I$ can be first expressed in the form
$g = \sum h_\alpha f_\alpha$, and then one can represent every polynomial $h_\alpha$ as the sum of
its homogeneous components. As a result, every homogeneous component $g_i$
of $g$ will be represented in the form $g_i = \sum h'_\alpha f_\alpha$, where $h'_\alpha$ is the homogeneous
component of $h_\alpha$ of degree $i - \deg f_\alpha$, and so $g_i \in I$.

An ideal $I$ is homogeneous if and only if it can be represented in the form
$I = \bigoplus_{i=0}^{\infty} I_i$, where $I_i = I \cap A_i$. Therefore the quotient ring $M = A/I$ is a graded
module over $A$ whose grading is induced, i.e., is given by $M_i = A_i/(I \cap A_i)$.

Let us elucidate this. Let $g, h \in A = K[x_0, \dots, x_n]$ be some polynomials,
and $g_i$ and $h_i$ their homogeneous components of degree $i$. The classes $g + I$
and $h + I$ coincide if and only if the classes $g_i + I \cap A_i$ and $h_i + I \cap A_i$ coincide
for all $i$. Therefore

$$M = A/I = \bigoplus_{i=0}^{\infty} A_i/(I \cap A_i) = \bigoplus_{i=0}^{\infty} M_i.$$

The action of $A$ on $M$ is as follows: for $g \in A$ and $f + I \in M$, the element
$g(f + I) \in M$ is defined as $gf + I \in M$. Clearly, we have

$$A_i \bigl(A_j/(I \cap A_j)\bigr) \subset A_{i+j}/(I \cap A_{i+j}).$$


Theorem 6.1.6 (Hilbert). Let $K$ be a field, $A = K[x_0, \dots, x_n]$, and let
$M = \bigoplus_{i=0}^{\infty} M_i$ be a finitely generated graded $A$-module. Then there exists a polynomial
$p_M(t)$ of degree $\le n$ such that for all sufficiently large $i$ the dimension of $M_i$,
as a vector space over $K$, is equal to $p_M(i)$.


Proof. We use induction on $n$. The starting point of the induction is
$n = -1$, i.e., $A = K$. In this case, $M$ is a finite-dimensional vector space over
$K$, and so $M_i = 0$ for sufficiently large $i$. Therefore $p_M = 0$.

Now let $n \ge 0$ and suppose that the statement holds for the modules over
the ring $A' = K[x_0, \dots, x_{n-1}]$, where $A' = K$ for $n = 0$. Set $x = x_n$ and
consider the $A$-modules $M' = \{m \in M \mid xm = 0\}$ and $M'' = M/xM$. These
modules are finitely generated over $A$ and annihilated by multiplication by
$x$, i.e., $xM' = 0$ and $xM'' = 0$. Hence $M'$ and $M''$ are finitely generated
$A'$-modules. Therefore, by the induction hypothesis, for sufficiently large $i$, we
have $\dim M'_i = p_1(i)$ and $\dim M''_i = p_2(i)$, where $p_1$ and $p_2$ are polynomials
of degree not higher than $n - 1$.
For every positive integer $i$, we have the exact sequence

$$0 \to M'_i \to M_i \xrightarrow{x} M_{i+1} \to M''_{i+1} \to 0,$$

where the map $M_i \xrightarrow{x} M_{i+1}$ is the multiplication by $x$. Therefore

$$\dim M'_i - \dim M_i + \dim M_{i+1} - \dim M''_{i+1} = 0,$$

i.e.,

$$\dim M_{i+1} - \dim M_i = \dim M''_{i+1} - \dim M'_i.$$

For sufficiently large $i$, we have

$$\dim M''_{i+1} - \dim M'_i = p_2(i + 1) - p_1(i) = q(i),$$

where $q(x)$ is a polynomial of degree no higher than $n - 1$.

Let $f(i) = \dim M_i$. For sufficiently large $i$, we have $f(i + 1) - f(i) = q(i)$,
where $q$ is a polynomial of degree no higher than $n - 1$. Let us show that this
implies that, for sufficiently large $i$, the function $f$ coincides with a
polynomial of degree no higher than $n$. Set

$$x^{(m)} = x(x - 1) \cdots (x - m + 1).$$

It is easy to verify that

$$(x + 1)^{(m)} - x^{(m)} = m x^{(m-1)}.$$

The polynomials $x^{(0)} = 1, x^{(1)}, \dots, x^{(n-1)}$ constitute a basis of the space of
polynomials of degree no higher than $n - 1$, and so we can represent $q$ in the
form $q(x) = \sum_{s=0}^{n-1} a_s x^{(s)}$. Thus, the polynomial $f_0(x) = \sum_{s=0}^{n-1} \frac{a_s}{s+1} x^{(s+1)}$ satisfies
the relation $f_0(x + 1) - f_0(x) = q(x)$. It is also clear that the function $c(i) =
f(i) - f_0(i)$ satisfies the relation $c(i + 1) - c(i) = 0$ for sufficiently large $i$, and
so $c(i) = c$ is a constant. Therefore $f(x) = \sum_{s=0}^{n-1} \frac{a_s}{s+1} x^{(s+1)} + c$ is a polynomial
of degree no higher than $n$.
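
The summation step at the end of this proof can be traced numerically. The following sketch expands a polynomial $q$ in the basis of falling factorials by Newton's forward differences (a standard technique substituted here for the basis argument of the text) and builds the polynomial $f_0$ with $f_0(x+1) - f_0(x) = q(x)$. The function names and the sample $q(x) = x^2 + 1$ are assumptions made for illustration.

```python
from fractions import Fraction


def falling(x, m):
    """Falling factorial x^(m) = x (x - 1) ... (x - m + 1), with x^(0) = 1."""
    result = 1
    for k in range(m):
        result *= x - k
    return result


def falling_coefficients(q, degree):
    """Coefficients a_s with q(x) = sum_s a_s x^(s), computed from q(0..degree).

    Uses Newton's forward differences: a_s = (Delta^s q)(0) / s!.
    """
    values = [Fraction(q(x)) for x in range(degree + 1)]
    coeffs = []
    factorial = 1
    for s in range(degree + 1):
        coeffs.append(values[0] / factorial)
        values = [b - a for a, b in zip(values, values[1:])]
        factorial *= s + 1
    return coeffs


def antidifference(coeffs):
    """Return f0 with f0(x + 1) - f0(x) = sum_s a_s x^(s)."""
    def f0(x):
        return sum(a / (s + 1) * falling(x, s + 1) for s, a in enumerate(coeffs))
    return f0


def q(x):
    return x * x + 1          # sample polynomial of degree 2


a = falling_coefficients(q, 2)   # q(x) = 1 + x^(1) + x^(2)
f0 = antidifference(a)
```

For this sample, `a` equals `[1, 1, 1]`, and `f0(i + 1) - f0(i)` reproduces `q(i)` for every integer `i`, so any function with the same differences differs from `f0` by a constant.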


The polynomial $p_M(t)$ is called Hilbert’s polynomial of the module $M$.
Clearly, this polynomial is integer-valued (see page 85), and so it can be
represented in the form

$$p_M(i) = c_0 \binom{i}{m} + c_1 \binom{i}{m-1} + \dots + c_m,$$

where $c_0, \dots, c_m$ are integers and $m \le n$. We assume that $c_0 \ne 0$. If $M =
\mathbb{C}[x_0, \dots, x_n]/I$, where $I$ is a homogeneous ideal, then, under certain natural
restrictions, the numbers $c_0$ and $m$ have the following geometric interpretation.

To a homogeneous ideal $I$ there corresponds an algebraic set $V(I)$ in the
projective space $\mathbb{C}P^n$, namely

$$V(I) = \{(a_0, \dots, a_n) \in \mathbb{C}P^n \mid f(a_0, \dots, a_n) = 0 \text{ for any } f \in I\}.$$

The ideal $I$ is said to be prime if $fg \in I$ implies either $f \in I$ or $g \in I$.
For a prime ideal $I$, the algebraic set $V(I)$ is irreducible, i.e., it cannot be
represented as a union of $V(I_1)$ and $V(I_2)$, where $I_1$ and $I_2$ are nontrivial
homogeneous ideals. The restriction mentioned above is that $I$ should be a
prime ideal. Then $m$ coincides with the dimension of the projective algebraic
variety $V(I)$ and $c_0$ coincides with the degree of this variety (the degree of a
variety of dimension $m$ in $\mathbb{C}P^n$ is defined as the number of intersection points
of this variety with the generic subspace of dimension $n - m$, in other words,
with almost all such subspaces). For the proof of this statement, see [Mu2].


Example 1. If $M = \mathbb{C}[x_0, \dots, x_n]$, then $m = n$ and $c_0 = 1$.


Indeed, $M_i$ consists of homogeneous polynomials of degree $i$ in $n + 1$
indeterminates. We assign to a monomial $x_0^{i_0} \cdots x_n^{i_n}$ the sequence consisting of
$i_0$ zeros and one unit, followed by $i_1$ zeros and one unit, and so on, and the
sequence ends with $i_n$ zeros. This sequence consists, therefore, of $i + n$
numbers among which there are $i$ zeros and $n$ units. The total number of such
sequences is

$$\binom{i+n}{n} = \frac{(i+n)(i+n-1) \cdots (i+1)}{n!} = \frac{i^n}{n!} + \cdots.$$

Therefore $p_M(i) = \dim M_i = \binom{i+n}{n} = \binom{i}{n} + \cdots$.
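
The count in Example 1 can be confirmed by direct enumeration. The sketch below (with illustrative function names) lists the exponent vectors of monomials of degree $i$ in $n + 1$ indeterminates, compares their number with $\binom{i+n}{n}$, and builds the zero-one sequence described above.

```python
from itertools import product
from math import comb


def count_monomials(num_vars, degree):
    """Number of monomials of the given total degree in num_vars indeterminates."""
    return sum(
        1
        for e in product(range(degree + 1), repeat=num_vars)
        if sum(e) == degree
    )


def zero_one_word(exponents):
    """Encode x_0^{i_0} ... x_n^{i_n} as i_0 zeros, a unit, i_1 zeros, a unit, ..."""
    word = []
    for k, e in enumerate(exponents):
        word.extend([0] * e)
        if k < len(exponents) - 1:
            word.append(1)       # a unit separates consecutive blocks of zeros
    return word


# dim M_i for n = 2 (three indeterminates) and i = 4, counted two ways.
enumerated = count_monomials(3, 4)
predicted = comb(4 + 2, 2)

word = zero_one_word((2, 0, 1))  # the monomial x_0^2 x_2 with i = 3, n = 2
```

The word has $i + n$ entries with exactly $n$ units, which is the bijection behind the binomial coefficient.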


Example 2. If $M = A/f_d A$, where $A = \mathbb{C}[x_0, \dots, x_n]$, and $f_d$ is a
homogeneous polynomial of degree $d$, then $m = n - 1$ and $c_0 = d$.


Indeed, multiplication by $f_d$ yields the exact sequence

$$0 \to H_{i-d} \xrightarrow{f_d} H_i \to M_i \to 0,$$

where $H_i$ is the space of homogeneous polynomials of degree $i$ in $n + 1$
indeterminates. In Example 1 it is shown that $\dim H_i = \binom{i+n}{n}$, and so

$$\dim M_i = \dim H_i - \dim H_{i-d} = \binom{i+n}{n} - \binom{i-d+n}{n} = \frac{d\, i^{n-1}}{(n-1)!} + \cdots = d \binom{i}{n-1} + \cdots,$$

where $c_0 = d$ and $m = n - 1$.
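
The numbers $m$ and $c_0$ of Example 2 can be recovered from the values $\dim M_i$ alone. The sketch below is an illustration under the assumption that the values are sampled at $i \ge d$, where $\dim M_i$ already agrees with the Hilbert polynomial; it takes repeated forward differences, using the fact that $\Delta \binom{i}{k} = \binom{i}{k-1}$, so the last nonzero difference has order $m$ and equals $c_0$. All function names are hypothetical.

```python
from math import comb


def hypersurface_dimension(n, d, i):
    """dim M_i for M = A / f_d A with A = C[x_0, ..., x_n] and deg f_d = d."""
    lower = comb(i - d + n, n) if i >= d else 0   # H_{i-d} = 0 when i < d
    return comb(i + n, n) - lower


def forward_differences(values):
    """Return [p(i0), (Delta p)(i0), (Delta^2 p)(i0), ...] from consecutive values."""
    diffs = []
    row = list(values)
    while row:
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return diffs


def dimension_and_degree(n, d):
    """Recover (m, c_0) from n + 2 consecutive values of dim M_i, starting at i = d."""
    values = [hypersurface_dimension(n, d, i) for i in range(d, d + n + 2)]
    diffs = forward_differences(values)
    m = max(k for k, c in enumerate(diffs) if c != 0)
    return m, diffs[m]


cubic_curve = dimension_and_degree(2, 3)      # plane cubic: dimension 1, degree 3
quadric_surface = dimension_and_degree(3, 2)  # quadric in CP^3: dimension 2, degree 2
```

For a plane curve of degree $d$ the same computation gives $\dim M_i = di - d(d-3)/2$ for $i \ge d$, in agreement with $m = 1$ and $c_0 = d$.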

Let $M$ be a graded finitely generated $A$-module and $p_M(i) = c_0 \binom{i}{m} + \cdots$
its Hilbert’s polynomial. Then the number $m = \dim M$ will be called the
dimension of $M$ and $c_0 = \deg M$ its degree. As we have already mentioned
(and Examples 1 and 2 support), if $I \subset \mathbb{C}[x_0, \dots, x_n]$ is a homogeneous prime
ideal and $M = \mathbb{C}[x_0, \dots, x_n]/I$, the number $\dim M$ is the dimension of the
variety $V(I) \subset \mathbb{C}P^n$ and $\deg M$ is the degree of this variety.

Let us discuss now certain properties of the degree and the dimension of
graded $A$-modules $M$ which we will need in section 6.1.4. We are especially
interested in the case when $M = A/I$, where $I$ is a homogeneous prime ideal.
In this case $M$ is a ring without zero divisors, i.e., an integral domain.

On the other hand, the homogeneous ideal $I$ is prime if and only if the
$A$-module $M = A/I$ possesses the following property: if $f \notin I$, then $fm \ne 0$
for $m \ne 0$ (here $f \in A$ and $m \in M$). In the general case the $A$-module $M$ is
said to be integral if for any $f \in A$ either $fM = 0$ or $fm \ne 0$ for $m \ne 0$.

In the rest of this section we will assume that $A = K[x_0, \dots, x_n]$, where
$K$ is a field and $A$ is endowed with the natural grading $A = \bigoplus_{i=0}^{\infty} A_i$, where $A_i$ is
the set of homogeneous polynomials of degree $i$. Let $M$ be a graded finitely
generated $A$-module such that $\dim M \ge 0$, i.e., $p_M \ne 0$.


Theorem 6.1.7. Let $M$ be an integral module and let $f \in A_d$ be a
homogeneous polynomial of degree $d$ such that $fM \ne 0$. Then

$$\dim(M/fM) = \dim M - 1 \quad \text{and} \quad \deg(M/fM) = d \deg M.$$


In geometric terms, this statement looks as follows: a hypersurface of
degree $d$ singles out a subvariety of dimension $m - 1$ and degree $dr$ in a projective
algebraic variety of dimension $m$ and degree $r$.


Proof. Let $M' = \{m \in M \mid fm = 0\}$. The multiplication by $f$ gives us
the exact sequence

$$0 \to M'_{i-d} \to M_{i-d} \xrightarrow{f} M_i \to (M/fM)_i \to 0.$$

From the hypothesis of the theorem it follows that $M' = 0$. Therefore, for
sufficiently large $i$, we have

$$p_{M/fM}(i) = p_M(i) - p_M(i - d).$$

Let $\dim M = m$ and $\deg M = r$. Then

$$p_{M/fM}(i) = \frac{r}{m!}\bigl(i^m - (i - d)^m\bigr) + \cdots = \frac{rmd}{m!}\, i^{m-1} + \cdots = \frac{rd}{(m-1)!}\, i^{m-1} + \cdots,$$

i.e., $p_{M/fM}(i) = dr \binom{i}{m-1} + \cdots$, as was required.

A submodule $S \subset M$ is said to be homogeneous if it is generated by
homogeneous elements (i.e., the elements from homogeneous summands $M_i$).
An equivalent condition is that $S = \bigoplus_{i=0}^{\infty} S_i$, where $S_i = S \cap M_i$. The quotient
module $M/S$ inherits the natural grading: $M/S = \bigoplus_{i=0}^{\infty} (M_i/S_i)$.

i=0 

Theorem 6.1.8. Let $p$ be a prime which does not divide $\deg M$. Then $M$
has a homogeneous submodule $S$ such that the quotient $N = M/S$ satisfies
the following conditions:

(a) $\dim N = \dim M$;

(b) $\deg N$ is not divisible by $p$;

(c) the module $N$ is integral.


In geometric terms this corresponds to separation of an irreducible com- 
ponent of the maximal dimension from an arbitrary algebraic set consisting 
of several components. 


Proof. Properties (a) and (b) hold for $S = 0$. In addition, $M$ is finitely
generated over $A = K[x_0, \dots, x_n]$, and so any increasing sequence of submodules
of $M$ stabilizes. Therefore there exists a maximal homogeneous submodule
$S$ with properties (a) and (b). Let us show that this maximal submodule $S$
also possesses property (c), i.e., the module $N = M/S$ is integral.

We have to prove that if $f \in A$, then either $fn = 0$ for all $n \in N$ or
$fn \ne 0$ for all $n \ne 0$. It suffices to prove this statement for any homogeneous
$f$. Indeed, for an arbitrary polynomial we could then deduce the statement
desired as follows. Let us decompose $f$ and $n$ into homogeneous constituents:
$f = f_s + f_{s+1} + \cdots$ and $n = n_t + n_{t+1} + \cdots$, where $f_s \ne 0$ and $n_t \ne 0$. Suppose
that $fn = 0$, i.e.,

$$f_s n_t = 0, \quad f_{s+1} n_t + f_s n_{t+1} = 0, \quad f_{s+2} n_t + f_{s+1} n_{t+1} + f_s n_{t+2} = 0, \ \dots.$$

Since $n_t \ne 0$, we consecutively get $f_s N = 0$, $f_{s+1} N = 0$, $f_{s+2} N = 0$, ...
Hence, $fN = 0$.

Now, let $f$ be a homogeneous polynomial of degree $d$ and $fN \ne 0$. Set
$N' = \{n \in N \mid fn = 0\}$. Then multiplication by $f$ yields the exact sequence

$$0 \to N'_{i-d} \to N_{i-d} \xrightarrow{f} N_i \to (N/fN)_i \to 0,$$

and hence

$$0 \to N_{i-d}/N'_{i-d} \to N_i \to (N/fN)_i \to 0.$$

Therefore, for large $i$, we have

$$p_N(i) = p_{N/fN}(i) + p_{N/N'}(i - d). \qquad (1)$$

The condition $fN \ne 0$ means that $fM + S \ne S$. Since $S$ is maximal, the
module

$$N/fN = (M/S)/f(M/S) = M/(S + fM)$$

cannot simultaneously possess properties (a) and (b). Therefore either (i)
$\dim(N/fN) < \dim N = \dim M$ or (ii) $\dim(N/fN) = \dim M$ and $\deg(N/fN)$
is divisible by $p$.

In case (i) by formula (1)

$$p_{N/N'}(i - d) = \frac{r\, i^n}{n!} + \cdots,$$

where $n = \dim N$, $r = \deg N$.

In case (ii)

$$p_{N/N'}(i - d) = \frac{(r - r_1)\, i^n}{n!} + \cdots,$$

where $r_1 = \deg(N/fN)$. Since $r$ is not divisible by $p$ and $r_1$ is divisible by
$p$, it follows that in both cases $\dim(N/N') = n = \dim M$ and the number
$\deg(N/N')$, which is either $r$ or $r - r_1$, is not divisible by $p$. Therefore the
module $N/N'$ possesses properties (a) and (b). The module $N/N'$ is of the
form $M/S'$, where $S'$ is a homogeneous submodule of $M$ containing $S$. From
the maximality of $S$ it follows that $S' = S$, i.e., $N' = 0$, as was required.


Theorem 6.1.9. Let $I \subset A = K[x_0, \dots, x_n]$ be a homogeneous prime ideal
and suppose that the dimension of the integral $A$-module $M = A/I$ is equal to
zero and its degree is equal to $r \ne 0$, i.e., $p_M(i) = r \ne 0$. Then

a) $xM \ne 0$ for some $x = x_i$;

b) $L = M/(x - 1)M$ is a field which is an extension of degree $r$ of $K$.


Proof. a) If $x_j M = 0$ for $j = 0, \dots, n$, then $x_j \in I$ for all $j$, and therefore
$M = K$. Thus, $p_M(i) = 0$, which contradicts the assumption that $p_M(i) = r \ne 0$.

b) Fix $x = x_i$ such that $xM \ne 0$. Then $xm \ne 0$ for $m \ne 0$ since $M$ is
an integral module. This means that the map $M \xrightarrow{x} M$ is monomorphic. For
sufficiently large $i$, we have $\dim M_i = p_M(i) = r$, and so $\dim M_{i+1} = \dim M_i$.
Hence, for sufficiently large $i$, the map $M_i \xrightarrow{x} M_{i+1}$ is one-to-one.

In $M$, consider the non-homogeneous submodule $(x - 1)M$. Let

$$\pi : M \to M/(x - 1)M = L$$

be the natural projection. Then $\pi(xm) = \pi(m)$. Clearly, $x\pi(m) = \pi(xm)$.
Hence $L \xrightarrow{x} L$ is the identity map.

For definiteness, let the map $M_i \xrightarrow{x} M_{i+1}$ be one-to-one for $i \ge a$. Let us show
then that $L = \pi(M_a) = \pi(M_{a+1}) = \cdots$. Let us express $m \in M$ in the form

$$m = m_0 + \dots + m_a + m_{a+1} + \dots + m_k,$$

where $m_i \in M_i$. Clearly,

$$\pi(m_0 + \dots + m_a) = \pi(x^a m_0 + x^{a-1} m_1 + \dots + m_a) = \pi(m'),$$

where $m' = x^a m_0 + x^{a-1} m_1 + \dots + m_a \in M_a$. Moreover,

$$m_{a+1} = x\, m_{a,1}, \quad m_{a+2} = x^2 m_{a,2}, \quad \dots, \quad m_k = x^{k-a} m_{a,k-a},$$

where $m_{a,s} \in M_a$. Therefore

$$\pi(m_{a+1} + \dots + m_k) = \pi(m_{a,1} + \dots + m_{a,k-a}) = \pi(m''),$$

where $m'' = m_{a,1} + \dots + m_{a,k-a} \in M_a$. Thus the natural projection $M_a \to L$
is onto.

On the other hand, for obvious reasons, this projection is monomorphic:
if $\pi(m_a) = 0$, then $m_a = (x - 1)m$, but no homogeneous element $m_a \ne 0$ can
be represented in the form $(x - 1)m$. Therefore the projection $M_a \to L$ is
one-to-one and $\dim L = \dim M_a = r$.

In the situation considered, $L$ is an $r$-dimensional algebra over $K$ (i.e., a
commutative ring and simultaneously a linear space over $K$). Let us show that
$L$ has no zero divisors. Let $l', l'' \in L$ and $l'l'' = 0$. We have shown above that
$l' = \pi(m')$ and $l'' = \pi(m'')$ for some $m', m'' \in M_a$. Hence $0 = l'l'' = \pi(m'm'')$,
where $m'm'' \in M_{2a}$. For $b \ge a$, the projection $M_b \to L$ is an isomorphism,
and therefore $m'm'' = 0$. By assumption the ideal $I$ is prime, and so in the
ring $M = A/I$ there are no zero divisors. Hence either $m' = 0$ or $m'' = 0$, i.e.,
either $l' = 0$ or $l'' = 0$.

Now it is easy to show that $L$ is a field, i.e., any nonzero element $l \in L$
is invertible. Indeed, the map $x \mapsto lx$, where $x \in L$, is a linear map $L \to
L$ with the zero kernel. In the finite-dimensional case, any such map is an
isomorphism, and so, in particular, $lx = 1$ for some $x \in L$.


6.1.4 The homogeneous Hilbert’s Nullstellensatz for p-fields 


The following statement was first proved in Hilbert’s paper [Hi4], though many
mathematicians of the 19th century had already applied it, albeit without proper
justification.


Theorem 6.1.10. Let $K$ be an algebraically closed field, and, for $n \ge 1$, let
$A = K[x_0, \dots, x_n]$. Then any homogeneous polynomials $f_1, \dots, f_n \in A$ have
a common zero distinct from the origin.


A similar statement also holds for so-called $p$-fields, among which we
encounter, in particular, the field of real numbers $\mathbb{R}$.

Let $p$ be a prime. The field $K$ is called a $p$-field if the degree of any finite
extension of $K$ is of the form $p^s$. In particular, any algebraically closed field
is a $p$-field for all primes $p$, and $\mathbb{R}$ is a 2-field.


Theorem 6.1.11. Let $K$ be a $p$-field, and let $A = K[x_0, \dots, x_n]$, where $n \ge 1$.
Then any homogeneous polynomials $f_1, \dots, f_n \in A$ whose degrees are not
divisible by $p$ have a common non-trivial zero.

Proof. [H. Fendrich; see [Pf2], ch. 4] We use the results of the preceding
section, namely Theorems 6.1.7-6.1.9.

Let us start with the construction of a sequence of integral finitely
generated graded $A$-modules $M_0 = A$, $M_1 = A/I_1$, ..., $M_n = A/I_n$ such that
$\dim M_i = n - i$, $\deg M_i$ is not divisible by $p$ and the homogeneous prime ideal
$I_i$ contains the polynomials $f_1, \dots, f_i$. The module $M_0 = A$ satisfies these
conditions since $\dim M_0 = n$ and $\deg M_0 = 1$ (see Example 1 on page 227).

Suppose that the modules $M_0, \dots, M_i$ ($i \ge 0$) are already constructed.
Let us show how to construct $M_{i+1}$ given $M_i = A/I_i$ and $f_{i+1}$. There are two
cases to consider.

Case 1: $f_{i+1} \notin I_i$, i.e., $f_{i+1} M_i \ne 0$. Set

$$N_{i+1} = M_i/f_{i+1} M_i = A/(I_i + f_{i+1} A).$$

By Theorem 6.1.7,

$$\dim N_{i+1} = \dim M_i - 1 = n - (i + 1)$$

and

$$\deg N_{i+1} = \deg f_{i+1} \deg M_i.$$

The number $\deg N_{i+1}$ is not divisible by $p$ since neither $\deg f_{i+1}$ nor $\deg M_i$
is divisible by $p$.

Case 2: $f_{i+1} \in I_i$. From the condition $\dim M_i \ge 0$ it follows that $x \notin I_i$,
i.e., $xM_i \ne 0$ for some $x = x_j$. Indeed, if $x_0, \dots, x_n \in I_i$, then either $M_i = K$
or $M_i = 0$, and so $p_{M_i} = 0$.

Set $N_{i+1} = M_i/xM_i$. By Theorem 6.1.7, $\dim N_{i+1} = \dim M_i - 1 = n - (i + 1)$
and $\deg N_{i+1} = \deg M_i$ since the degree of the polynomial $x$ is equal to 1.

In both cases we have obtained a module $N_{i+1}$, but it is not necessarily
an integral one. To obtain an integral module, we use Theorem 6.1.8. By this
theorem $N_{i+1}$ has a homogeneous submodule $S_{i+1}$ such that the quotient

$$M_{i+1} = N_{i+1}/S_{i+1} = A/I_{i+1}$$

possesses all the properties desired: $\dim M_{i+1} = \dim N_{i+1} = n - (i + 1)$,
$\deg M_{i+1}$ is not divisible by $p$ and $M_{i+1}$ is integral; here $I_{i+1}$ is a homogeneous
ideal containing $f_1, \dots, f_{i+1}$.

The dimension of the last of the constructed modules $M_n$ is equal to
zero, and so we can apply Theorem 6.1.9 to it. As a result, we obtain a
field $L = M_n/(x_j - 1)M_n = A/I$. Here $I$ is an inhomogeneous prime ideal
which possesses two properties important for us: (1) $x_j \equiv 1 \pmod{I}$ and (2)
$I \supset I_n \ni f_1, \dots, f_n$.

The field $L$ is an extension of $K$ of degree $\deg M_n$. But on the one hand,
$\deg M_n$ is not divisible by $p$ while, on the other hand, the degree of any
extension of $K$ is of the form $p^s$. Hence, $L = K$.

Let $a_i \in K$ be the image of $x_i$ under the natural projection

$$A = K[x_0, \dots, x_n] \to A/I = L = K.$$

Property (1) implies that $a_j = 1$, and so $a = (a_0, \dots, a_n) \ne 0$. Property (2)
implies that $f_1(a) = \dots = f_n(a) = 0$.


Theorem 6.1.11 enables us to prove purely algebraically, for the case of
polynomial functions, the well-known Borsuk-Ulam theorem on the common zero
of odd functions on the sphere.


Theorem 6.1.12. Let $q_1, \dots, q_n \in \mathbb{R}[x_1, \dots, x_{n+1}]$ be odd polynomials, i.e.,
$q_i(-x) = -q_i(x)$. Then these polynomials have a common zero on the unit
sphere $x_1^2 + \dots + x_{n+1}^2 = 1$.


Proof. Let us pass from $q_i$ to the homogeneous polynomial $\tilde{q}_i$ with the help
of an extra indeterminate $x_0$. Under this passage the monomial $x_1^{m_1} \cdots x_{n+1}^{m_{n+1}}$
in $q_i$ is replaced by $x_0^{m_0} x_1^{m_1} \cdots x_{n+1}^{m_{n+1}}$, where $m_0 = \deg q_i - m_1 - \dots - m_{n+1}$.
The degrees of all terms of any odd polynomial are odd, and so the numbers
$\deg q_i$ and $m_1 + \dots + m_{n+1}$ are odd, and therefore $m_0$ is even. Having replaced
$x_0^2$ by $x_1^2 + \dots + x_{n+1}^2$, we obtain from $\tilde{q}_i$ a homogeneous polynomial $f_i$ of
odd degree. By Theorem 6.1.11 (recall that $\mathbb{R}$ is a 2-field) the polynomials
$f_1, \dots, f_n$ have a common zero $\alpha = (\alpha_1, \dots, \alpha_{n+1}) \ne 0$. For all $t \in \mathbb{R}$ the point
$t\alpha$ is also a common zero of the homogeneous polynomials $f_1, \dots, f_n$, and so
we may assume that $\alpha_1^2 + \dots + \alpha_{n+1}^2 = 1$. Then $q_i(\alpha) = \tilde{q}_i(1, \alpha) = f_i(\alpha) = 0$,
as was required.
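
The passage from an odd polynomial $q_i$ to the homogeneous polynomial $f_i$ can be made concrete. The sketch below assumes the sample polynomial $q = x_1^3 + x_2$ (not from the text) and multiplies each monomial by the appropriate power of $x_1^2 + \dots + x_{n+1}^2$, which is the combined effect of homogenizing with $x_0$ and then replacing $x_0^2$. Polynomials are stored as dictionaries from exponent tuples to coefficients, and all helper names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def poly_add(p, q):
    """Sum of two polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for poly in (p, q):
        for e, c in poly.items():
            out[e] += c
    return {e: c for e, c in out.items() if c != 0}


def poly_mul(p, q):
    """Product of two polynomials in the same representation."""
    out = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}


def evaluate(p, point):
    """Value of the polynomial p at the given point."""
    total = 0
    for e, c in p.items():
        term = c
        for x, k in zip(point, e):
            term *= x ** k
        total += term
    return total


def odd_homogenization(q):
    """Multiply each monomial of degree k by (x_1^2 + ...)^((deg q - k) / 2)."""
    nvars = len(next(iter(q)))
    deg = max(sum(e) for e in q)
    squares = {tuple(2 if j == i else 0 for j in range(nvars)): 1
               for i in range(nvars)}
    f = {}
    for e, c in q.items():
        gap = deg - sum(e)
        assert gap % 2 == 0, "q must be odd: every term has odd degree"
        term = {e: c}
        for _ in range(gap // 2):
            term = poly_mul(term, squares)
        f = poly_add(f, term)
    return f


q = {(3, 0): 1, (0, 1): 1}               # sample odd polynomial x_1^3 + x_2
f = odd_homogenization(q)                # x_1^3 + x_1^2 x_2 + x_2^3
point = (Fraction(3, 5), Fraction(4, 5))  # a rational point on the unit circle
```

On the unit sphere the factor $x_1^2 + \dots + x_{n+1}^2$ equals 1, so `f` and `q` take the same values there, which is exactly what the last line of the proof uses.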


From Theorem 6.1.12 we can derive the usual Borsuk-Ulam theorem for
odd continuous functions $g_1, \dots, g_n$ by approximating these functions by
polynomials.


6.2 Gröbner bases


In solutions of various computational problems related to ideals in polynomial
rings, Gröbner bases are very convenient. Bruno Buchberger introduced this
notion in his thesis [Bu1] written under the scientific guidance of Wolfgang
Gröbner, see also [Bu2]. Buchberger also suggested a convenient algorithm
for calculating Gröbner bases; this made Gröbner bases an effective
computational tool.

Our exposition of the theory of Gröbner bases is largely based on the first
chapter of the book [Ad]. For further details on various aspects of the theory
of Gröbner bases, see the book [Bu3].


6.2.1 Polynomials in one variable


In the case of polynomials in one variable over a field $K$, the algorithm for finding
a basis of the ideal is based on the division with residue of one polynomial by
another one. The first step in division with residue is performed as follows.
Let

$$f(x) = a_n x^n + \dots + a_0 \quad \text{and} \quad g(x) = b_m x^m + \dots + b_0, \quad \text{where } n \ge m.$$

Set

$$f_1(x) = f(x) - \frac{a_n x^n}{b_m x^m}\, g(x).$$

If $\deg f_1 \ge \deg g$, we apply the same procedure to $f_1$, and so on. Finally we
obtain $f = qg + r$, where $\deg r < \deg g$ (or $r = 0$). Here the polynomials $q$
and $r$ are uniquely defined.
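
The division procedure just described can be sketched as follows. This is a minimal illustration, assuming that a polynomial is stored as a list of coefficients with index equal to the degree and that the coefficient field is $\mathbb{Q}$; the names `trim` and `divmod_poly` are not from the text.

```python
from fractions import Fraction


def trim(p):
    """Drop zero coefficients of highest degree; the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def divmod_poly(f, g):
    """Return (q, r) with f = q*g + r and deg r < deg g (or r = 0)."""
    f = [Fraction(c) for c in trim(f)]
    g = [Fraction(c) for c in trim(g)]
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    r = f
    while len(r) >= len(g):
        shift = len(r) - len(g)        # exponent n - m of the current step
        factor = r[-1] / g[-1]         # quotient a_n / b_m of leading coefficients
        q[shift] = factor
        for k, c in enumerate(g):      # r := r - factor * x^shift * g
            r[shift + k] -= factor * c
        r = trim(r)                    # the leading term has cancelled
    return q, r


# (x^2 + 1) = (x - 1)(x + 1) + 2
quotient, remainder = divmod_poly([1, 0, 1], [1, 1])
```

Each pass of the loop is one step $f \mapsto f_1$ of the text, and the degree of `r` drops by at least one every time, so the loop terminates.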


Theorem 6.2.1. Any ideal $I$ in the ring $K[x]$ of polynomials in one variable
is a principal one, i.e., is generated by one element.


Proof. In $I$, select a nonzero polynomial $g$ of the least degree. Let $f \in I$. Then
$f = qg + r$, where $\deg r < \deg g$ or $r = 0$. But $r = f - qg \in I$, and hence $r = 0$. This
means that $I$ is generated by $g$.


Let $I(f_1, \dots, f_n)$ be the ideal generated by $f_1(x), \dots, f_n(x)$. The
polynomial $g(x)$ which generates this ideal is denoted by $(f_1, \dots, f_n)$ and is
called the greatest common divisor (GCD) of the polynomials $f_1(x), \dots, f_n(x)$.
The greatest common divisor possesses the following properties:

(1) $f_1, \dots, f_n$ are all divisible by $g$;

(2) if $f_1, \dots, f_n$ are all divisible by a polynomial $h$, then $g$ is divisible by $h$.

Property (1) follows from the fact that $f_1, \dots, f_n \in I(f_1, \dots, f_n) = I(g)$.
Property (2) follows from the fact that $g \in I(f_1, \dots, f_n)$, i.e.,

$$g = u_1 f_1 + \dots + u_n f_n, \quad\text{where } u_1, \dots, u_n \in K[x].$$

Properties (1) and (2) determine the polynomial $g$ uniquely up to a constant
factor. Indeed, if the polynomials $g_1$ and $g_2$ are divisible by each other,
they are proportional.

The greatest common divisor $(f_1, f_2)$ of two polynomials $f_1$ and $f_2$ can be
found with the help of Euclid's algorithm.

From properties (1) and (2) it follows easily that

$$(f_1, f_2, \dots, f_n) = (f_1, (f_2, \dots, f_n)).$$

This remark reduces the calculation of the greatest common divisor of $n$
polynomials to the calculation of the greatest common divisor of two polynomials.
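As a hedged illustration of this reduction, the following Python sketch computes the GCD of several polynomials by repeated application of Euclid's algorithm to pairs. The representation (coefficient lists, highest degree first, no leading zeros) and the names `poly_rem`, `poly_gcd` and `poly_gcd_many` are assumptions made for the example; the result is normalized to be monic, which fixes the constant factor left free by properties (1) and (2).

```python
from fractions import Fraction
from functools import reduce

def poly_rem(f, g):
    """Residue of f after division by g (coefficient lists, highest degree first)."""
    r = [Fraction(c) for c in f]
    while len(r) >= len(g):
        c = r[0] / g[0]
        for i in range(len(g)):
            r[i] -= c * g[i]
        r.pop(0)
    while r and r[0] == 0:
        r.pop(0)
    return r

def poly_gcd(f, g):
    """Monic GCD of two polynomials by Euclid's algorithm ([] is the zero polynomial)."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while g:
        f, g = g, poly_rem(f, g)
    return [c / f[0] for c in f] if f else []

def poly_gcd_many(polys):
    """GCD of several polynomials, folding pairwise GCDs over the list."""
    return reduce(poly_gcd, polys)

# (x-1)(x-2), (x-1)(x-3), (x-1)(x+1) have the common divisor x - 1.
common = poly_gcd_many([[1, -3, 2], [1, -4, 3], [1, 0, -1]])
```

The sketch folds from the left, i.e., it computes $((f_1, f_2), f_3)$ and so on; since the GCD is determined by the ideal, the grouping does not affect the result.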
6.2.2 Division of polynomials in several variables


To define division with residue for polynomials in several variables, we
have to fix an order on the set of monomials. In what follows we assume that the
monomials are ordered lexicographically, i.e., the monomial
$x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$ is greater than
$x^\beta = x_1^{\beta_1} \cdots x_n^{\beta_n}$ if
$\alpha_1 = \beta_1, \dots, \alpha_k = \beta_k$ and $\alpha_{k+1} > \beta_{k+1}$
(perhaps $k = 0$).

The expression $f = a_\alpha x^\alpha + \cdots$ will mean that $a_\alpha x^\alpha$ is the
highest term of $f$, i.e., $x^\alpha$ is the highest monomial entering $f$.

Let $f = a_\alpha x^\alpha + \cdots$ and $g = b_\beta x^\beta + \cdots$ be two polynomials in
$n$ variables. If a term $c_\gamma x^\gamma$ of $f$ is divisible by $x^\beta$, we define

$$f_1 = f - \frac{c_\gamma x^\gamma}{b_\beta x^\beta}\, g.$$

If a term of $f_1$ is divisible by $x^\beta$, we apply to $f_1$ a similar transformation, etc.

For this process to terminate after finitely many steps, we should proceed,
for example, as follows. For the term $c_\gamma x^\gamma$ we take the highest of all the
monomials of $f$ divisible by $x^\beta$. In this way the highest term of
$f$ divisible by $x^\beta$ will be strictly decreasing. Clearly, any strictly decreasing
sequence of monomials in $n$ variables is finite. Indeed, the exponent of $x_1$ does
not increase along the sequence, so after finitely many steps it stabilizes; from
then on the exponent of $x_2$ does not increase, so it stabilizes as well, etc.

Similarly, one can define division with residue of a polynomial by several
polynomials. As a result, we obtain a representation

$$f = u_1 f_1 + \dots + u_s f_s + r,$$

where the polynomial $r$ has no terms divisible by the highest monomial of any of
the polynomials $f_1, \dots, f_s$. In this case we say that $r$ is the residue after
division of $f$ by the polynomials $f_1, \dots, f_s$. Observe that $r$ is not uniquely
defined. One of the possible definitions of a Gröbner basis says precisely that
$f_1, \dots, f_s$ is a Gröbner basis if the residue after division of any polynomial $f$
by $f_1, \dots, f_s$ is uniquely defined.
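To make the non-uniqueness of the residue concrete, here is a minimal Python sketch of division with residue in several variables under the lexicographic order. It is an illustration only: a polynomial is assumed to be stored as a dictionary from exponent tuples to rational coefficients, the function names are hypothetical, and the sample polynomials $f = x^2y + xy^2 + y^2$, $f_1 = xy - 1$, $f_2 = y^2 - 1$ are chosen for the example and do not come from the text.

```python
from fractions import Fraction

def lead(p):
    """Highest monomial of a nonzero polynomial; lex order is tuple comparison."""
    return max(p)

def residue(f, divisors):
    """Residue of f after division by `divisors`, tried in the given order.

    A polynomial is a dict {exponent tuple: coefficient}.
    """
    f = {m: Fraction(c) for m, c in f.items()}
    while True:
        # Highest term of f divisible by the highest monomial of some divisor.
        step = next(((m, g) for m in sorted(f, reverse=True) for g in divisors
                     if all(i <= j for i, j in zip(lead(g), m))), None)
        if step is None:
            return f
        m, g = step
        lm = lead(g)
        c = f[m] / g[lm]
        shift = tuple(j - i for i, j in zip(lm, m))
        for mono, a in g.items():          # f := f - c * x^shift * g
            key = tuple(i + j for i, j in zip(mono, shift))
            value = f.get(key, 0) - c * a
            if value == 0:
                f.pop(key, None)
            else:
                f[key] = value

# Exponent tuples are (power of x, power of y), with x > y.
f = {(2, 1): 1, (1, 2): 1, (0, 2): 1}      # x^2 y + x y^2 + y^2
f1 = {(1, 1): 1, (0, 0): -1}               # x y - 1
f2 = {(0, 2): 1, (0, 0): -1}               # y^2 - 1

r12 = residue(f, [f1, f2])                 # x + y + 1
r21 = residue(f, [f2, f1])                 # 2x + 1
```

The two residues differ, so in the sense of the definition above the pair $f_1, f_2$ is not a Gröbner basis.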


6.2.3 Definition of Grobner bases 


We say that (nonzero) polynomials $g_1, \dots, g_t \in I$ constitute a Gröbner basis
of the ideal $I$ if the highest term of any (nonzero) polynomial $f \in I$ is divisible
by the highest term of one of the polynomials $g_1, \dots, g_t$.


Theorem 6.2.2. The polynomials $g_1, \dots, g_t$ form a Gröbner basis of an ideal
$I$ if and only if one of the following equivalent conditions holds:

(a) $f \in I$ $\iff$ the residue after division of $f$ by $g_1, \dots, g_t$ is equal to 0;

(b) $f \in I$ $\iff$ $f = \sum h_i g_i$ and the highest monomial of $f$ is equal to the
highest of the products of the highest monomials of $h_i$ and $g_i$;

(c) the ideal $L(I)$ generated by the highest terms of the elements of $I$ is
also generated by the highest terms of the polynomials $g_1, \dots, g_t$.

Proof. First, we prove that if $g_1, \dots, g_t$ is a Gröbner basis of $I$, then
condition (a) holds. It suffices to prove that if $r$ is the residue after division
of $f \in I$ by $g_1, \dots, g_t$, then $r = 0$. Clearly, $r = f - \sum h_i g_i \in I$. Therefore,
if $r \ne 0$, the highest term of $r$ is divisible by the highest term of one of the
polynomials $g_1, \dots, g_t$, which contradicts the definition of $r$.

(a) $\Rightarrow$ (b) By definition of division with residue, $f = \sum h_i g_i + r$, where the
highest monomial of $f$ is equal to the highest of the products of the highest
monomials of $h_i$ and $g_i$. Condition (a) implies that if $f \in I$, then $r = 0$.

(b) $\Rightarrow$ (c) If $f = a x^\alpha + \cdots \in I$, then $f = \sum h_i g_i$, where
$h_i = b_i x^{\beta_i} + \cdots$ and $g_i = c_i x^{\gamma_i} + \cdots$ and all the monomials
$x^{\beta_i} x^{\gamma_i}$ are not greater than $x^\alpha$. Therefore
$a x^\alpha = \sum b_i x^{\beta_i} c_i x^{\gamma_i}$, where the sum runs over the $i$ for which
$x^{\beta_i} x^{\gamma_i} = x^\alpha$. Thus $a x^\alpha$ lies in the ideal generated by the highest
terms $c_i x^{\gamma_i}$ of the polynomials $g_i$, as required.

It remains to prove that if (c) holds, then $g_1, \dots, g_t$ is a Gröbner basis. Let
$f = a x^\alpha + \cdots \in I$. Then

$$a x^\alpha = \sum_i b_i x^{\beta_i} c_i x^{\gamma_i},$$

where the $c_i x^{\gamma_i}$ are the highest terms of some of the polynomials $g_1, \dots, g_t$.
Clearly, $x^\alpha$ is divisible by $x^{\gamma_i}$ for some $i$.


Corollary. If $g_1, \dots, g_t$ is a Gröbner basis of an ideal $I$, then the
polynomials $g_1, \dots, g_t$ generate $I$.


This follows from (a). 


Theorem 6.2.3. Every nonzero ideal $I \subset K[x_1, \dots, x_n]$ possesses a Gröbner
basis.


Proof. Consider the ideal $L(I)$ generated by the highest monomials $X_\alpha$ of
all the polynomials $g_\alpha = a_\alpha X_\alpha + \cdots \in I$. Clearly, $f \in L(I)$ if and only if
every monomial of $f$ is divisible by some monomial $X_\alpha$. By Hilbert's basis theorem,
the ideal $L(I)$ is generated by finitely many polynomials $f_1, \dots, f_s$. Every monomial
of any of these polynomials is divisible by a monomial $X_\alpha$. As a result, we
obtain a finite set of monomials $X_1, \dots, X_t$ which generate the ideal $L(I)$.
These monomials are the highest monomials of some polynomials $g_1, \dots, g_t \in I$. By
Theorem 6.2.2 (c) the polynomials $g_1, \dots, g_t$ form a Gröbner basis of $I$.


We will say that polynomials $g_1, \dots, g_t$ constitute a Gröbner basis if they
constitute a Gröbner basis of the ideal which they generate.


Theorem 6.2.4. Nonzero polynomials $g_1, \dots, g_t$ constitute a Gröbner basis if
and only if the residue after the division of any polynomial $f$ by $g_1, \dots, g_t$ is
uniquely determined.


Proof. First, suppose that the polynomials $g_1, \dots, g_t$ constitute a Gröbner
basis. Let $r_1$ and $r_2$ be residues after division of $f$ by $g_1, \dots, g_t$. Then the
polynomials $f - r_1$ and $f - r_2$ belong to the ideal $I$ generated by $g_1, \dots, g_t$.
Therefore $r_1 - r_2 = (f - r_2) - (f - r_1) \in I$. By definition of a Gröbner basis,
if $r_1 - r_2 \ne 0$, then the highest monomial of the polynomial $r_1 - r_2$ is divisible
by the highest monomial of one of the polynomials $g_1, \dots, g_t$. On the other hand,
neither $r_1$ nor $r_2$ has terms divisible by the highest monomials of $g_1, \dots, g_t$. Hence
$r_1 - r_2 = 0$.

Now suppose that the residue after division of any polynomial $f$ by
$g_1, \dots, g_t$ is uniquely determined. We have to prove that if $f \in I$, then the
residue $r$ after division of $f$ by $g_1, \dots, g_t$ is zero.

First let us show that, if $a$ is a number, then the polynomials $f$ and
$f - a x^\alpha g_i$, where $x^\alpha$ is a monomial, give identical residues after division by
$g_1, \dots, g_t$. Recall that, during the division with residue, an elementary
transformation consists in annihilating a monomial $c_\gamma x^\gamma$ of $f$ by replacing $f$ with
$f - d x^\delta g_j$.

Let $g_i = b x^\beta + \cdots$. If, for one of the polynomials $f$ and $f - a x^\alpha g_i$, the
coefficient of $x^{\alpha+\beta}$ vanishes, then the polynomial with a nonzero coefficient of this
monomial can be reduced by an elementary transformation to the polynomial
with the zero coefficient.

If the coefficients of $x^{\alpha+\beta}$ for both polynomials $f$ and $f - a x^\alpha g_i$ are nonzero,
then both polynomials can be reduced by an elementary transformation to the
polynomial $f - c x^\alpha g_i$ with the zero coefficient of $x^{\alpha+\beta}$.

In all the cases the polynomials $f$ and $f - a x^\alpha g_i$ can be reduced by an
elementary transformation to the same polynomial. Hence, after division by
$g_1, \dots, g_t$, their residues are identical (we make use of the assumption on the
uniqueness of the residue).

Now it is easy to prove the desired result. If $f \in I$, then $f = \sum h_i g_i$.
Having expressed each polynomial $h_i$ as the sum of monomials, we obtain
$f = \sum_\alpha a_\alpha x^\alpha g_{i_\alpha}$. The polynomials $f$ and $f - \sum_\alpha a_\alpha x^\alpha g_{i_\alpha} = 0$ give the same
residues after division by $g_1, \dots, g_t$. But, for the zero polynomial, the residue
after division is equal to zero. Hence the same holds for $f$.


6.2.4 Buchberger’s algorithm 


None of the preceding definitions of a Gröbner basis allows us to determine
in finitely many steps whether the set $g_1, \dots, g_t$ is a Gröbner basis or not. Let
us give at last a definition which enables us to do this.

Let $f = a x^\alpha + \cdots$, $g = b x^\beta + \cdots$ and let $x^\gamma$ be the least common multiple
of $x^\alpha$ and $x^\beta$. Set

$$S(f, g) = \frac{x^\gamma}{a x^\alpha}\, f - \frac{x^\gamma}{b x^\beta}\, g.$$

We construct $S(f, g)$ so that the highest terms of its two constituents cancel.


Theorem 6.2.5 (Buchberger). The polynomials $g_1, \dots, g_t$ constitute a
Gröbner basis if and only if, for all $i \ne j$, the residue after division of $S(g_i, g_j)$
by $g_1, \dots, g_t$ is zero.

Proof. If $g_1, \dots, g_t$ constitute a Gröbner basis, then by Theorem 6.2.2 (a)
the polynomial $S(g_i, g_j)$, which belongs to the ideal $I$ generated by these
polynomials, gives residue 0 after division by them. Therefore we only have to prove
that if the residue after division of $S(g_i, g_j)$ by $g_1, \dots, g_t$ is 0 for all $i \ne j$, then
the polynomials $g_1, \dots, g_t$ form a Gröbner basis, i.e., any polynomial $f \in I$
can be represented in the form $f = \sum h_i g_i$, where the highest monomial of $f$
is equal to the highest of the products of the highest monomials of $h_i$ and $g_i$
(see Theorem 6.2.2 (b)).

Let us first prove one auxiliary statement. 


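Lemma. Let $f_1, \dots, f_s$ be polynomials with the same highest monomial
$x^\alpha$. If the highest monomial of $f = \sum \lambda_i f_i$, where $\lambda_i$ are numbers, is strictly
smaller than $x^\alpha$, then $f = \sum_{i<j} \mu_{ij} S(f_i, f_j)$ for some numbers $\mu_{ij}$.

Proof. By the hypothesis $f_i = a_i x^\alpha + \cdots$ and $f_j = a_j x^\alpha + \cdots$, and so
$S(f_i, f_j) = \dfrac{f_i}{a_i} - \dfrac{f_j}{a_j}$. It is also clear that

$$f = \sum \lambda_i a_i \frac{f_i}{a_i}
= \lambda_1 a_1 \Bigl(\frac{f_1}{a_1} - \frac{f_2}{a_2}\Bigr)
+ (\lambda_1 a_1 + \lambda_2 a_2) \Bigl(\frac{f_2}{a_2} - \frac{f_3}{a_3}\Bigr) + \cdots$$

$$\cdots + (\lambda_1 a_1 + \dots + \lambda_{s-1} a_{s-1}) \Bigl(\frac{f_{s-1}}{a_{s-1}} - \frac{f_s}{a_s}\Bigr)
+ (\lambda_1 a_1 + \dots + \lambda_s a_s) \frac{f_s}{a_s}.$$

It remains to observe that $\lambda_1 a_1 + \dots + \lambda_s a_s = 0$. Indeed, the coefficient of the
monomial $x^\alpha$ in the polynomial $f = \sum \lambda_i f_i$ equals exactly $\lambda_1 a_1 + \dots + \lambda_s a_s$
and, by assumption, the highest monomial of $f$ is strictly smaller than $x^\alpha$.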


There are several ways to represent $f = a x^\alpha + \cdots \in I$ in the form

$$f = \sum h_i g_i. \tag{$*$}$$

Let $h_i = b_i x^{\beta_i} + \cdots$ and $g_i = c_i x^{\gamma_i} + \cdots$. Denote the highest of the monomials
$x^{\beta_i} x^{\gamma_i}$, where $i = 1, \dots, t$, by $x^\delta$. Select a representation $(*)$ so that the
monomial $x^\delta$ is the minimal one. We have to prove that in this case $x^\delta = x^\alpha$.
Clearly, $x^\alpha$ cannot be higher than $x^\delta$. Suppose that $x^\delta$ is higher than $x^\alpha$. We
may assume that $x^{\beta_i} x^{\gamma_i} = x^\delta$ for $i = 1, \dots, M$ and, for $i = M+1, \dots, t$, the
monomial $x^\delta$ is higher than $x^{\beta_i} x^{\gamma_i}$.

Consider the polynomial $g = \sum_{i=1}^{M} b_i x^{\beta_i} g_i$. The coefficients of $x^\delta$ in this
polynomial and in $f$ coincide (both are zero, since $x^\delta$ is higher than $x^\alpha$). Hence $g$
is a linear combination of polynomials with the highest monomial $x^\delta$ and all the
highest monomials cancel each other.

In this case, thanks to the Lemma,

$$g = \sum \mu_{ij}\, S(x^{\beta_i} g_i, x^{\beta_j} g_j), \tag{1}$$

where summation runs over pairs $i, j$ such that $1 \le i < j \le M$. The highest
monomials of $x^{\beta_i} g_i$ and $x^{\beta_j} g_j$ coincide, and so

$$S(x^{\beta_i} g_i, x^{\beta_j} g_j) = \frac{x^{\beta_i} g_i}{c_i} - \frac{x^{\beta_j} g_j}{c_j}
= \frac{x^\delta}{x^{\gamma_{ij}}}\, S(g_i, g_j),$$

where $x^{\gamma_{ij}}$ is the least common multiple of $x^{\gamma_i}$ and $x^{\gamma_j}$.

By assumption, the residue after division of $S(g_i, g_j)$ by $g_1, \dots, g_t$ is equal
to 0. The polynomial $S(x^{\beta_i} g_i, x^{\beta_j} g_j)$ is the product of $S(g_i, g_j)$ and a monomial,
and hence the residue after its division by $g_1, \dots, g_t$ is also zero. The algorithm of
division with residue gives a representation

$$S(x^{\beta_i} g_i, x^{\beta_j} g_j) = \sum_\nu h_{ij\nu}\, g_\nu,$$

where the highest of the products of the highest monomials of the polynomials
$h_{ij\nu}$ and $g_\nu$ coincides with the highest monomial of $S(x^{\beta_i} g_i, x^{\beta_j} g_j)$. The latter
monomial is strictly less than $x^\delta$. Let us substitute the obtained representation
of $S(x^{\beta_i} g_i, x^{\beta_j} g_j)$ into (1) and then substitute the obtained representation
of $g$ into $f = g + \cdots$. As a result, we obtain a representation of $f$ that
contradicts the assumption on the minimality of $x^\delta$. This contradiction shows
that $x^\delta = x^\alpha$.


With the help of Theorem 6.2.5 it is easy to show that the following
algorithm enables us to find a Gröbner basis of the ideal generated by
polynomials $f_1, \dots, f_s$. Let us calculate the residues after division of the
polynomials $S(f_i, f_j)$ by $f_1, \dots, f_s$ and add all the nonzero residues to the collection
$f_1, \dots, f_s$. Let us repeat this procedure for the obtained set of polynomials,
etc. This sequence of operations terminates after finitely many steps: the highest
monomial of every added residue is not divisible by the highest monomials of the
polynomials already in the collection, so the ideal generated by the highest monomials
grows strictly at each step, and by Hilbert's basis theorem such a chain of ideals
stabilizes. Theorem 6.2.5 ensures that as a result we obtain a Gröbner basis
of the ideal generated by $f_1, \dots, f_s$. This algorithm for calculating a Gröbner
basis is called Buchberger's algorithm.
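The procedure just described can be sketched in Python as follows. This is a minimal illustration, not an efficient implementation: polynomials are assumed to be dictionaries from exponent tuples to rational coefficients, the lexicographic order is tuple comparison, and all function names are hypothetical. One small deviation from the description above is labelled in the code: each S-polynomial is divided by the current collection together with the residues already found in the same pass, which avoids adding duplicates.

```python
from fractions import Fraction
from itertools import combinations

def lead(p):
    return max(p)  # lexicographic order is tuple comparison

def add_scaled(f, c, shift, g):
    """Return f + c * x^shift * g, dropping terms with zero coefficient."""
    h = dict(f)
    for m, a in g.items():
        key = tuple(i + j for i, j in zip(m, shift))
        v = h.get(key, 0) + c * a
        if v == 0:
            h.pop(key, None)
        else:
            h[key] = v
    return h

def residue(f, basis):
    """Residue of f after division by the polynomials in `basis`."""
    f = {m: Fraction(c) for m, c in f.items()}
    reduced = True
    while reduced:
        reduced = False
        for mono in sorted(f, reverse=True):
            for g in basis:
                lm = lead(g)
                if all(i <= j for i, j in zip(lm, mono)):
                    shift = tuple(j - i for i, j in zip(lm, mono))
                    f = add_scaled(f, -f[mono] / g[lm], shift, g)
                    reduced = True
                    break
            if reduced:
                break
    return f

def s_poly(f, g):
    """S(f, g): the highest terms of the two constituents cancel."""
    a, b = lead(f), lead(g)
    lcm = tuple(max(i, j) for i, j in zip(a, b))
    sf = tuple(i - j for i, j in zip(lcm, a))
    sg = tuple(i - j for i, j in zip(lcm, b))
    h = add_scaled({}, 1 / Fraction(f[a]), sf, f)
    return add_scaled(h, -1 / Fraction(g[b]), sg, g)

def buchberger(polys):
    basis = [{m: Fraction(c) for m, c in p.items()} for p in polys]
    while True:
        new = []
        for f, g in combinations(basis, 2):
            # Deviation from the text: divide by basis + new, not by basis alone.
            r = residue(s_poly(f, g), basis + new)
            if r:
                new.append(r)
        if not new:
            return basis
        basis += new

# Example (an assumption, not from the text): the ideal (xy - 1, y^2 - 1), x > y.
f1 = {(1, 1): 1, (0, 0): -1}
f2 = {(0, 2): 1, (0, 0): -1}
G = buchberger([f1, f2])                     # adds the polynomial x - y
f = {(2, 1): 1, (1, 2): 1, (0, 2): 1}        # x^2 y + x y^2 + y^2
same = residue(f, G) == residue(f, G[::-1])  # residue no longer depends on the order
```

For this input the only polynomial added is $x - y$, and the residue of the sample polynomial $f$ is $2y + 1$ whichever order the basis elements are tried in, as Theorem 6.2.4 predicts.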


6.2.5 A reduced Gröbner basis


For the same ideal, Buchberger's algorithm leads to distinct finite results
depending on the choice of generators of the ideal and the sequence of operations.
One can however modify the algorithm so that the final result only depends
on the ideal $I$ itself; this modification is also due to Buchberger.

First of all, let us ensure that the number of elements of the Gröbner
basis is uniquely determined. We call a Gröbner basis $g_1, \dots, g_t$ minimal if
$g_i = x^{\alpha_i} + \cdots$ and the monomials $x^{\alpha_i}$ and $x^{\alpha_j}$ are not divisible by each other
for $i \ne j$.

Any ideal $I$ has a minimal Gröbner basis.


Indeed, let $g_1, \dots, g_t$ be a Gröbner basis of $I$. We may assume that
$g_i = x^{\alpha_i} + \cdots$. If $x^{\alpha_1}$ is divisible by $x^{\alpha_2}$, then already $g_2, \dots, g_t$ is a Gröbner basis
of $I$. Indeed, if $f = x^\alpha + \cdots \in I$, then, by definition of a Gröbner basis,
$x^\alpha$ is divisible by $x^{\alpha_i}$ for some $i$. But $x^{\alpha_1}$ is divisible by $x^{\alpha_2}$, and therefore
$x^\alpha$ is divisible by $x^{\alpha_i}$ for some $i \ge 2$. This means that $g_2, \dots, g_t$ is a Gröbner basis
of $I$. Consecutively deleting the polynomials whose highest monomials are
divisible by the highest monomials of other polynomials of the basis, we can
pass from an arbitrary Gröbner basis $g_1, \dots, g_t$ to a minimal Gröbner basis.


Theorem 6.2.6. If $g_1, \dots, g_t$ and $f_1, \dots, f_s$ are two minimal Gröbner bases
of the same ideal $I$, then $s = t$ and the highest monomials of the polynomials
$g_i$ and $f_{\sigma(i)}$, where $\sigma$ is a permutation of indices, coincide.


Proof. Let $g_i = x^{\alpha_i} + \cdots$ and $f_j = x^{\beta_j} + \cdots$. On the one hand, $f_1 \in I$ and
$g_1, \dots, g_t$ is a Gröbner basis of $I$. Therefore $x^{\beta_1}$ is divisible by $x^{\alpha_i}$ for some $i$.
After renumbering we may assume that $i = 1$. On the other hand, $g_1 \in I$ and
$f_1, \dots, f_s$ is a Gröbner basis of $I$. Therefore $x^{\alpha_1}$ is divisible by $x^{\beta_j}$ for some $j$,
and hence $x^{\beta_1}$ is divisible by $x^{\beta_j}$. From the minimality of the Gröbner basis
$f_1, \dots, f_s$, it follows that $j = 1$. The monomials $x^{\alpha_1}$ and $x^{\beta_1}$ are divisible by
each other, and so $x^{\alpha_1} = x^{\beta_1}$.

Similar arguments show that $x^{\beta_2}$ is divisible by $x^{\alpha_i}$ for some $i$. From the minimality
of the Gröbner basis $f_1, \dots, f_s$ it follows that $x^{\alpha_i} \ne x^{\beta_1}$, i.e., $i \ne 1$. Therefore,
after renumbering, we obtain $x^{\alpha_2} = x^{\beta_2}$, etc. Clearly, the sets of polynomials
$g_1, \dots, g_t$ and $f_1, \dots, f_s$ are exhausted simultaneously, i.e., $s = t$.


Now we can ensure that not only the number of the elements in the
Gröbner basis is uniquely defined, but also the elements themselves. Call a
Gröbner basis $g_1, \dots, g_t$ reduced if $g_i = x^{\alpha_i} + \cdots$ and the residue after division
of $g_i$ by $g_1, \dots, g_{i-1}, g_{i+1}, \dots, g_t$ coincides with $g_i$, i.e., none of the monomials
that enter $g_i$ is divisible by $x^{\alpha_j}$ for $j \ne i$.

Obviously, any reduced basis is also a minimal one. With the help of a
minimal Gröbner basis $g_1, \dots, g_t$ of an ideal $I$, we can construct a reduced basis
of $I$ as follows. Let $h_1$ be the residue after division
of $g_1$ by $g_2, \dots, g_t$; let $h_2$ be the residue after division of $g_2$ by $h_1, g_3, \dots, g_t$;
let $h_3$ be the residue after division of $g_3$ by $h_1, h_2, g_4, \dots, g_t$; etc.; finally, let $h_t$
be the residue after division of $g_t$ by $h_1, h_2, \dots, h_{t-1}$.

Then $h_1, \dots, h_t$ is a reduced Gröbner basis of $I$. Indeed, the minimality
of the Gröbner basis $g_1, \dots, g_t$ implies that the highest monomials of the
polynomials $h_i$ and $g_i$ coincide for all $i$. Hence $h_1, \dots, h_t$ is a minimal Gröbner basis
of $I$. Besides, after division of $g_i$ by $h_1, \dots, h_{i-1}, g_{i+1}, \dots, g_t$ we get the residue
$h_i$, which does not contain terms divisible by the highest monomials of the
polynomials $h_1, \dots, h_{i-1}$ and $g_{i+1}, \dots, g_t$. The latter monomials coincide with
the highest monomials of the polynomials $h_{i+1}, \dots, h_t$. Therefore $h_1, \dots, h_t$ is
a reduced Gröbner basis.


Theorem 6.2.7 (Buchberger). For any ideal $I$, there exists precisely one
reduced Gröbner basis.

Proof. We have just proved the existence of a reduced Gröbner basis. It
remains to prove its uniqueness. Let $f_1, \dots, f_s$ and $g_1, \dots, g_t$ be two reduced
Gröbner bases of $I$. The reduced bases are minimal, and so by Theorem 6.2.6
we have $s = t$ and we may assume that the highest monomials of $f_i$ and $g_i$
coincide. Suppose that $f_i - g_i \ne 0$. Then the highest monomial of $f_i - g_i \in I$ is
divisible by the highest monomial of some polynomial $g_j$. Here $j \ne i$ since the
highest monomial of $f_i - g_i$ is strictly less than the highest monomial of $g_i$. On
the other hand, if the highest monomial of $g_j$ divides the highest monomial
of $f_i - g_i$, then it should divide some monomial of one of the polynomials $f_i$
and $g_i$. But this contradicts the fact that the bases $f_1, \dots, f_s$ and $g_1, \dots, g_t$
are reduced (recall that the highest monomials of $g_j$ and $f_j$ coincide).


7


Hilbert’s Seventeenth Problem 


7.1 The sums of squares: introduction 


7.1.1 Several examples 


It is not difficult to prove that any polynomial $p(x)$ with real coefficients
which takes non-negative values for all $x \in \mathbb{R}$ can be represented as the sum
of squares of two polynomials with real coefficients. Indeed, the roots of a
polynomial with real coefficients can be divided into the real roots and pairs
of complex conjugate ones. Therefore

$$p(x) = a \prod_{j=1}^{s} (x - z_j)(x - \bar z_j) \prod_{k=1}^{t} (x - \alpha_k)^{m_k},$$

where $a, \alpha_k \in \mathbb{R}$. If $p(x) \ge 0$ for all $x \in \mathbb{R}$, then $a > 0$ and all the numbers $m_k$
are even, and so the real roots also split into pairs. Hence

$$p(x) = \Bigl(\sqrt{a} \prod_{j=1}^{l} (x - z_j)\Bigr) \Bigl(\sqrt{a} \prod_{j=1}^{l} (x - \bar z_j)\Bigr),$$

where some of the $z_j$ can be real. Let

$$\sqrt{a} \prod_{j=1}^{l} (x - z_j) = q(x) + i r(x),$$

where $q$ and $r$ are polynomials with real coefficients. Then

$$\sqrt{a} \prod_{j=1}^{l} (x - \bar z_j) = q(x) - i r(x).$$

As a result, we obtain $p(x) = (q(x))^2 + (r(x))^2$.
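The construction of $q$ and $r$ can be followed numerically. The sketch below is an illustration under stated assumptions: the caller supplies the leading coefficient $a > 0$ and one root $z_j$ from each pair (the function does not find roots), polynomials are coefficient lists with the lowest degree first, and the names `poly_mul` and `two_squares` are hypothetical.

```python
import cmath

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def two_squares(a, roots):
    """Given p(x) = a * prod (x - z)(x - conj(z)) over z in `roots`,
    return real polynomials (q, r) with p = q^2 + r^2."""
    h = [cmath.sqrt(a)]                # sqrt(a) * prod (x - z) = q + i r
    for z in roots:
        h = poly_mul(h, [-z, 1])
    q = [c.real for c in h]
    r = [c.imag for c in h]
    return q, r

# p(x) = (x^2 + 1)(x^2 - 2x + 2), with one root from each conjugate pair.
q, r = two_squares(1, [1j, 1 + 1j])
p = [s + t for s, t in zip(poly_mul(q, q), poly_mul(r, r))]
```

Here $q(x) = x^2 - x - 1$ and $r(x) = -2x + 1$, and $q^2 + r^2 = x^4 - 2x^3 + 3x^2 - 2x + 2 = (x^2 + 1)(x^2 - 2x + 2)$.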
For polynomials in several indeterminates, a similar statement is not
always true, i.e., there exist non-negative polynomials (by which we mean
polynomials with real coefficients whose values are non-negative for all
real values of the variables) that cannot be represented as the sum of squares
of polynomials with real coefficients. Hilbert was the first to prove this in 1888
[Hi1], but he did not give an explicit example of such a polynomial. The first
simple example was given by T. Motzkin in 1967.


Example 7.1.1. [Mo] The polynomial

$$F(x, y) = x^2 y^2 (x^2 + y^2 - 3) + 1$$

is non-negative but it cannot be represented as the sum of squares of
polynomials with real coefficients.


Proof. First let us verify that $F(x, y) \ge 0$. If $x = 0$ or $y = 0$, then
$F(x, y) = 1$. We therefore consider $xy \ne 0$. In this case, $x^2$, $y^2$ and $x^{-2} y^{-2}$
are positive and their product is equal to 1. Hence, by the inequality between
the arithmetic and geometric means,

$$x^2 + y^2 + x^{-2} y^{-2} \ge 3,$$

and therefore $x^2 y^2 (x^2 + y^2 - 3) + 1 \ge 0$ as required.
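As a sanity check of the non-negativity claim (not a proof), one can evaluate the Motzkin polynomial $F(x, y) = x^2 y^2 (x^2 + y^2 - 3) + 1$ exactly on a rational grid. The grid, its step and the variable names below are arbitrary choices made for this sketch.

```python
from fractions import Fraction

def motzkin(x, y):
    return x**2 * y**2 * (x**2 + y**2 - 3) + 1

# Exact rational arithmetic on the grid {-3, -2.9, ..., 3}^2.
grid = [Fraction(i, 10) for i in range(-30, 31)]
values = {(x, y): motzkin(x, y) for x in grid for y in grid}
minimum = min(values.values())
minimizers = sorted(p for p, v in values.items() if v == minimum)
```

On this grid the minimum is 0 and it is attained exactly at the four points $(\pm 1, \pm 1)$, in agreement with the equality case $x^2 = y^2 = x^{-2} y^{-2} = 1$ of the inequality used above.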

Now, suppose that $F(x, y) = \sum f_j(x, y)^2$, where $f_j$ are polynomials with
real coefficients. Then $\sum f_j(x, 0)^2 = F(x, 0) = 1$. Hence $f_j(x, 0) = c_j$ is a
constant, and therefore $f_j(x, y) = c_j + y g_j(x, y)$. Similar arguments show that
$f_j(x, y) = c_j' + x g_j'(x, y)$. Clearly, $c_j = c_j'$ and $f_j(x, y) = c_j + xy h_j(x, y)$. Thus,

$$x^2 y^2 (x^2 + y^2 - 3) + 1 = x^2 y^2 \sum h_j^2 + 2xy \sum c_j h_j + \sum c_j^2,$$

i.e.,

$$x^2 y^2 (x^2 + y^2 - 3) - x^2 y^2 \sum h_j^2 = 2xy \sum c_j h_j + \sum c_j^2 - 1.$$

All the monomials on the right-hand side of this equality are of degree no
higher than 3, and all the monomials on the left-hand side of this equality are
of degree no less than 4. Indeed,

$$\deg h_j = \deg f_j - 2 \le \tfrac{1}{2} \deg F - 2 = 1.$$

Hence $x^2 y^2 (x^2 + y^2 - 3) - x^2 y^2 \sum h_j^2 = 0$, and therefore $x^2 + y^2 - 3 = \sum h_j^2$. A
contradiction, since $x^2 + y^2 - 3 < 0$ for $x = y = 0$.


Example 7.1.2. (R. M. Robinson, 1973) The polynomial

$$S(x, y) = x^2 (x^2 - 1)^2 + y^2 (y^2 - 1)^2 - (x^2 - 1)(y^2 - 1)(x^2 + y^2 - 1)$$

is non-negative but it cannot be represented as the sum of squares of
polynomials with real coefficients.

Proof. First let us verify that $S(x, y) \ge 0$. This is obvious for the points
lying in the non-shaded part of the plane on Fig. 7.1 since for any such point
either $x^2 + y^2 - 1 \le 0$ and $(x^2 - 1)(y^2 - 1) \ge 0$, or $x^2 + y^2 - 1 \ge 0$ and
$(x^2 - 1)(y^2 - 1) \le 0$. But $S(x, y)$ can be expressed differently, namely,

$$S(x, y) = (x^2 + y^2 - 1)(x^2 - y^2)^2 + (x^2 - 1)(y^2 - 1).$$

This expression makes it clear that $S(x, y) \ge 0$ for the points lying in the
shaded domain since for any such point $x^2 + y^2 - 1 \ge 0$ and $(x^2 - 1)(y^2 - 1) \ge 0$.


FIGURE 7.1 FIGURE 7.2 


Now, suppose that $S(x, y) = \sum f_j(x, y)^2$. The function $S$ vanishes at the 8
points depicted on Fig. 7.2. Therefore, at these points, each of the functions $f_j$
vanishes. But $\deg f_j \le \tfrac{1}{2} \deg S = 3$ and if a curve of degree no greater than 3
passes through the 8 points indicated, it necessarily passes through the origin
as well, as we will prove in a moment. Thus, $f_j(0, 0) = 0$ for all $j$, and hence
$S(0, 0) = 0$. But, obviously, $S(0, 0) = 1$. The contradiction obtained shows
that $S(x, y)$ cannot be represented as the sum of squares of polynomials.

The proof of the fact that any cubic curve passing through 8 of the
intersection points of the lines $p_i$ and $q_j$ ($i, j = 1, 2, 3$) must pass through the
9th point can be found in the book [Pr3]. For the configuration of the points
considered, we can give a simpler proof. Let us ascribe weight 1 to the points
$(\pm 1, \pm 1)$, weight $-2$ to the points $(\pm 1, 0)$ and $(0, \pm 1)$, and weight 4 to the
origin $(0, 0)$. Consider the sum over all these points of the values of the
function $x^p y^q$ multiplied by the corresponding weights. If $pq = 0$, the sum is zero.
If $p > 0$ and $q > 0$, only the points $(\pm 1, \pm 1)$ give a non-zero contribution to
the sum. Moreover, the sum is non-zero only if both $p$ and $q$ are even. But the
polynomial $f_j$ of degree not greater than 3 has no such monomials. Therefore
the weighted sum of the values of $f_j$ over the 9 points considered is zero. In
particular, if $f_j$ vanishes at 8 of the points, then it vanishes at the 9th point
as well.
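The weighted-sum argument is easy to check by direct computation. The sketch below uses the weights stated above and the formula $S(x, y) = x^2(x^2-1)^2 + y^2(y^2-1)^2 - (x^2-1)(y^2-1)(x^2+y^2-1)$ for Robinson's polynomial; the variable and function names are chosen for this illustration only.

```python
# Weights from the text: 1 at (+-1, +-1), -2 at (+-1, 0) and (0, +-1), 4 at the origin.
WEIGHTS = {(x, y): 1 for x in (-1, 1) for y in (-1, 1)}
WEIGHTS.update({(1, 0): -2, (-1, 0): -2, (0, 1): -2, (0, -1): -2, (0, 0): 4})

def weighted_sum(p, q):
    """Weighted sum of the monomial x^p y^q over the 9 points."""
    return sum(w * x**p * y**q for (x, y), w in WEIGHTS.items())

def robinson_S(x, y):
    return (x**2 * (x**2 - 1)**2 + y**2 * (y**2 - 1)**2
            - (x**2 - 1) * (y**2 - 1) * (x**2 + y**2 - 1))

# All monomials of degree at most 3.
low_degree_sums = {(p, q): weighted_sum(p, q) for p in range(4) for q in range(4 - p)}
zeros_of_S = [pt for pt in WEIGHTS if robinson_S(*pt) == 0]
```

Every entry of `low_degree_sums` is zero, while the monomial $x^2 y^2$ of degree 4 gives the weighted sum 4; the polynomial $S$ vanishes at the 8 points other than the origin, where it equals 1.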


Example 7.1.3. (R. M. Robinson, 1973) The polynomial

$$Q(x, y, z) = x^2 (x - 1)^2 + y^2 (y - 1)^2 + z^2 (z - 1)^2 + 2xyz(x + y + z - 2)$$

is non-negative but it cannot be represented as the sum of squares of
polynomials.


Proof. Suppose that $Q(x, y, z) = \sum f_j(x, y, z)^2$. Then the degree of each
polynomial $f_j$ does not exceed 2. Since the function $Q(x, y, z)$ vanishes at
all the points $(x, y, z)$ with the coordinates $x, y, z = 0$ or $1$, except for the
point $(1, 1, 1)$, the functions $f_j$ also vanish at all these points. As we
will establish shortly, this implies that $f_j(1, 1, 1) = 0$. But then $Q(1, 1, 1) = 0$
whereas, evidently, $Q(1, 1, 1) = 2$.


FIGURE 7.3 


Let us ascribe to the 8 points considered weights $\pm 1$ as indicated in Fig. 7.3
and consider the weighted sum over these points of the values $f_j(x, y, z)$. It
is easy to verify that the sum considered vanishes for the following functions:
$1$, $x$, $xy$, $x^2$. By symmetry, it also vanishes for all the other monomials of degree
at most 2. Hence it is equal to zero for the function $f_j(x, y, z)$ as well, since
$\deg f_j \le 2$. Hence, if the function $f_j$ vanishes at any seven of the 8 points
considered, it vanishes at the 8th point as well.

Let us prove now that $Q(x, y, z) \ge 0$. To this end, let us express $Q$ in three
ways as

$$Q = x^2 (x - 1)^2 + \bigl(y(y - 1) - z(z - 1)\bigr)^2 + Q_x$$
$$\phantom{Q} = y^2 (y - 1)^2 + \bigl(z(z - 1) - x(x - 1)\bigr)^2 + Q_y$$
$$\phantom{Q} = z^2 (z - 1)^2 + \bigl(x(x - 1) - y(y - 1)\bigr)^2 + Q_z,$$

where

$$Q_x = 2yz(x + y - 1)(x + z - 1),$$
$$Q_y = 2xz(x + y - 1)(y + z - 1),$$
$$Q_z = 2xy(x + z - 1)(y + z - 1).$$

It suffices to prove that at any point $(x, y, z)$ one of the functions $Q_x$, $Q_y$, $Q_z$
is non-negative. But these functions cannot be simultaneously negative since
their product is the square of the polynomial

$$2\sqrt{2}\, xyz (x + y - 1)(x + z - 1)(y + z - 1).$$

Example 7.1.4. (Anneli Lax, Peter D. Lax, 1978) The form

$$A(x) = A_1(x) + A_2(x) + A_3(x) + A_4(x) + A_5(x),$$

where $x = (x_1, x_2, x_3, x_4, x_5)$ and $A_i(x) = \prod_{j \ne i} (x_i - x_j)$, is non-negative but it
cannot be represented as a sum of squares of forms.

Remark. The form $A(x)$ only depends on differences of the variables, and
so it can be represented as a form in four indeterminates. On dividing this new
form by the 4th power of one of the indeterminates, we obtain a polynomial
of degree 4 in 3 variables.


First we verify that $A(x) \ge 0$. The value of $A(x)$ does not vary under any
permutation of the variables, and so we may assume that

$$x_1 \ge x_2 \ge x_3 \ge x_4 \ge x_5.$$

In this case

$$A_1(x) + A_2(x) = (x_1 - x_2)\bigl[(x_1 - x_3)(x_1 - x_4)(x_1 - x_5) - (x_2 - x_3)(x_2 - x_4)(x_2 - x_5)\bigr] \ge 0.$$

We similarly prove that $A_4(x) + A_5(x) \ge 0$. It is also clear that $A_3(x)$ is the
product of two non-positive and two non-negative factors, and so $A_3(x) \ge 0$.

Now suppose that $A(x) = \sum Q_j(x)^2$, where the $Q_j$ are quadratic forms. If
each $x_i$ is equal to some other $x_k$, then $A(x) = 0$, and hence $Q_j(x) = 0$. Therefore
the quadric $Q_j(x) = 0$ in $\mathbb{R}P^4$ contains the projective line $x_1 = x_2$, $x_3 = x_4 = x_5$.
Under permutation of coordinates we obtain 10 lines of this form (to determine
a line one should select 2 coordinates of 5). These lines intersect a generic
hyperplane at 10 points and the quadric $Q_j(x) = 0$ should pass through these
points. But in three-dimensional space a quadric does not pass through
10 generic points. Therefore one should expect that the form $Q_j$ vanishes
identically. We will establish shortly that this is indeed the case, and therefore
reach a contradiction.

Let $Q(x) = \sum c_{ij} x_i x_j$ with $c_{ij} = c_{ji}$. By the hypothesis

$$0 = Q(s, s, t, t, t) = (c_{11} + 2c_{12} + c_{22}) s^2
+ 2(c_{13} + c_{14} + c_{15} + c_{23} + c_{24} + c_{25}) st
+ (c_{33} + c_{44} + c_{55} + 2c_{34} + 2c_{35} + 2c_{45}) t^2.$$

Hence

$$c_{11} + 2c_{12} + c_{22} = 0, \tag{1}$$
$$c_{13} + c_{14} + c_{15} + c_{23} + c_{24} + c_{25} = 0, \tag{2}$$
$$c_{33} + c_{44} + c_{55} + 2c_{34} + 2c_{35} + 2c_{45} = 0. \tag{3}$$

Moreover, similar equalities obtained under any permutation of indices also
hold. In particular, (1) implies that

$$c_{33} + 2c_{34} + c_{44} = 0. \tag{4}$$

Subtracting (4) from (3) we obtain

$$c_{55} + 2c_{35} + 2c_{45} = 0.$$

Let $c_{55} = \lambda$. Then, for distinct $i$ and $j$ different from 5, we have
$c_{i5} + c_{j5} = -\frac{\lambda}{2}$. Hence $c_{15} = c_{25} = c_{35} = c_{45} = -\frac{\lambda}{4}$. Similarly,
$c_{21} = c_{31} = c_{41} = c_{51} = -\frac{c_{11}}{4}$, and so on. As a result, we get $c_{ii} = \lambda$ and
$c_{ij} = -\frac{\lambda}{4}$ for $i \ne j$. But then (2) implies that $\lambda = 0$, i.e., $Q(x) = 0$ for all $x$.


7.1.2 Artin-Cassels-Pfister theorem 


In subsection 7.1.1 we gave several examples of non-negative polynomials that 
cannot be represented as the sum of squares of polynomials. In what follows 
we will show that any non-negative polynomial can be represented as the 
sum of squares of rational functions. But for polynomials in one variable, 
the difference between representations as the sum of squares of polynomials 
and as the sum of squares of rational functions is inessential, as the following 
statement shows. 


Theorem 7.1.5. Let $K$ be a field of characteristic distinct from 2 and let
$f(x)$ be a polynomial over $K$. Suppose that

$$f(x) = \alpha_1 r_1(x)^2 + \dots + \alpha_n r_n(x)^2,$$

where $\alpha_i \in K$ and $r_i(x)$ are rational functions over $K$. Then

$$f(x) = \alpha_1 p_1(x)^2 + \dots + \alpha_n p_n(x)^2,$$

where $p_i(x)$ are polynomials over $K$.

This theorem has a long history. In 1927, Artin [Ar1] proved that

$$f(x) = \beta_1 p_1(x)^2 + \dots + \beta_m p_m(x)^2,$$

where $m$ is a number not necessarily equal to $n$. Next, in 1964, Cassels [Ca5]
showed that one may assume that $m = n$. And, in 1965, Pfister [Pf1] proved
that one may assume that $\beta_i = \alpha_i$.

Theorem 7.1.5 can be applied as well to any polynomial $f(x_1, \dots, x_n)$ in
$n$ indeterminates over a field $L$. For this, we take, for example, $x = x_1$ and
let $K = L(x_2, \dots, x_n)$ be the field of rational functions in indeterminates
$x_2, \dots, x_n$ over $L$. As a result, we see that in the representation of $f$ as the
sum of squares of rational functions we can remove any one of the indeterminates
from the denominators of these rational functions. But we cannot remove all the
indeterminates simultaneously.


Proof. For $n = 1$, the statement is obvious, and so in what follows we
assume that $n > 1$ and $\alpha_i \ne 0$ for all $i$. It is convenient to carry out the proof
in terms of quadratic forms over the field $K(x)$. Let $u = (u_1, \dots, u_n)$ and
$v = (v_1, \dots, v_n)$ be vectors with coordinates from $K(x)$. Define

$$\varphi(u, v) = \sum \alpha_i u_i v_i.$$

We have to prove that if $f \in K[x]$ and $f = \varphi(u, u)$, where $u_i \in K(x)$, then
$f = \varphi(w, w)$, where $w_i \in K[x]$. The quadratic form $\varphi(u, u)$ can be either
isotropic (i.e., $\varphi(u, u) = 0$ for some $u \ne 0$) or anisotropic (i.e., $\varphi(u, u) \ne 0$ for
all $u \ne 0$).

Case 1: the form $\varphi(u, u)$ is isotropic. In this case we will not even need
the condition $f = \varphi(u, u)$ for $u_i \in K(x)$. In other words, for any polynomial
$f$, there exists a vector $w$ with coordinates from $K[x]$ such that $f = \varphi(w, w)$.

In the equality $\varphi(u, u) = 0$, we may assume that the coordinates of $u$ are
polynomials: indeed, it suffices to reduce the rational functions $u_i$ to the common
denominator. We may also assume that the polynomials $u_1, \dots, u_n$ are relatively prime in
totality. Then there exist polynomials $v_1, \dots, v_n$ such that $u_1 v_1 + \dots + u_n v_n = 1$.

Indeed, we first represent $\mathrm{GCD}(f_1, f_2)$ in the form $u_1 f_1 + u_2 f_2$; then we
represent $\mathrm{GCD}(f_1, f_2, f_3)$ in the form $(u_1 f_1 + u_2 f_2) g_1 + u_3 f_3$, and so on. Having
divided each polynomial $v_i$ by the number $2\alpha_i$, we obtain a vector $v$ such that
$\varphi(u, v) = \frac{1}{2}$. But since

$$\varphi(u, v + \lambda u) = \varphi(u, v) \quad\text{and}\quad \varphi(v + \lambda u, v + \lambda u) = \varphi(v, v) + \lambda,$$

and since we can replace $v$ by $v - \varphi(v, v) u$, we may assume that $\varphi(v, v) = 0$.
The identity

$$\varphi(fu + v, fu + v) = f^2 \varphi(u, u) + 2f \varphi(u, v) + \varphi(v, v) = f$$

shows that any polynomial $f$ can be expressed as $f = \varphi(w, w)$, where
$w = fu + v$.

Case 2: the form $\varphi(u, u)$ is anisotropic. In this case we need the condition
$f = \varphi(u, u)$ for $u_i \in K(x)$. This is clear from the following example: $K = \mathbb{R}$,
$f(x) = -1$ and $\varphi(u, u) = u_1^2 + \dots + u_n^2$.

Let us multiply both sides of the equality $f = \varphi(u, u)$ by the square of the common
denominator of the rational functions $u_1, \dots, u_n$. As a result, we obtain an
equality of the form $\alpha_1 u_1^2 + \dots + \alpha_n u_n^2 = f u_0^2$, where $u_0, \dots, u_n$ are polynomials.
Among all the equalities of this form, we may select the equality with the least
degree $r$ of the polynomial $u_0$. We have to prove that $r = 0$. Suppose that
$r = \deg u_0 > 0$. Let us divide $u_i$ by $u_0$ with a residue, i.e., find a
polynomial $v_i$ such that $\deg(u_i - u_0 v_i) \le r - 1$.

In addition to vectors $u = (u_1, \dots, u_n)$ and $v = (v_1, \dots, v_n)$, we consider
vectors $\tilde u = (u_0, \dots, u_n)$ and $\tilde v = (v_0, \dots, v_n)$, where $v_0 = 1$. Let us also
consider the form

$$\psi(\tilde x, \tilde y) = \varphi(x, y) - f x_0 y_0.$$

By the hypothesis $\psi(\tilde u, \tilde u) = \varphi(u, u) - f u_0^2 = 0$. Moreover, $\psi(\tilde v, \tilde v) = \varphi(v, v) - f$,
since $v_0 = 1$. Therefore the equality $\psi(\tilde v, \tilde v) = 0$ contradicts the condition
$r > 0$. In what follows we assume that $\psi(\tilde v, \tilde v) \ne 0$.

This means, in particular, that the vectors $\tilde u$ and $\tilde v$ are not proportional.
Therefore the vector

$$\tilde w = \psi(\tilde v, \tilde v)\, \tilde u - 2\psi(\tilde u, \tilde v)\, \tilde v$$

is nonzero and $\psi(\tilde w, \tilde w) = 0$ since

$$\psi(\lambda \tilde u - \mu \tilde v, \lambda \tilde u - \mu \tilde v) = \mu\bigl(-2\lambda \psi(\tilde u, \tilde v) + \mu \psi(\tilde v, \tilde v)\bigr) = 0$$

for $\lambda = \psi(\tilde v, \tilde v)$ and $\mu = 2\psi(\tilde u, \tilde v)$.

Thus we have constructed a nonzero vector $\tilde w = (w_0, w)$ with polynomial
coordinates such that $\varphi(w, w) - f w_0^2 = 0$. Here $w_0 \ne 0$, since otherwise
$\varphi(w, w) = 0$ for $w \ne 0$, which contradicts the anisotropy of the form. To reach a
contradiction, it suffices to verify that $\deg w_0 < r$. Since $v_0 = 1$, we obtain

$$w_0 = \psi(\tilde v, \tilde v) u_0 - 2\psi(\tilde u, \tilde v) v_0
= \Bigl(\sum_{i=1}^{n} \alpha_i v_i^2 - f\Bigr) u_0 - 2\Bigl(\sum_{i=1}^{n} \alpha_i u_i v_i - f u_0\Bigr)$$

$$= \sum_{i=1}^{n} \alpha_i \bigl(v_i^2 u_0 - 2u_i v_i\bigr) + f u_0
= \sum_{i=1}^{n} \alpha_i \Bigl(v_i^2 u_0 - 2u_i v_i + \frac{u_i^2}{u_0}\Bigr)
= \frac{1}{u_0} \sum_{i=1}^{n} \alpha_i (u_i - u_0 v_i)^2$$

because $\sum_{i=1}^{n} \alpha_i u_i^2 = f u_0^2$.

Recall that $\deg(u_i - u_0 v_i) \le r - 1$. Hence

$$\deg w_0 = \deg\Bigl(\sum_{i=1}^{n} \alpha_i (u_i - u_0 v_i)^2\Bigr) - \deg u_0 \le 2(r - 1) - r = r - 2.$$

With the help of Theorem 7.1.5 we can indicate a non-negative polynomial
in $n$ indeterminates that cannot be represented as the sum of $n$ squares of
rational functions.

Theorem 7.1.6. The polynomial $x_1^2 + \dots + x_n^2 + 1$ cannot be represented as
the sum of $n$ squares of rational functions in indeterminates $x_1, \dots, x_n$ over
$\mathbb{R}$.


Proof. Set $K = \mathbb{R}(x_1, \dots, x_{n-1})$ and $x = x_n$ in the conditions of Theorem
7.1.5. Suppose that we have represented the polynomial $x_1^2 + \dots + x_n^2 + 1$ as
the sum of $n$ squares of rational functions in $x_1, \dots, x_n$. This means that the
polynomial $x^2 + d$, where $d = x_1^2 + \dots + x_{n-1}^2 + 1 \in K$, can be represented as
the sum of $n$ squares of elements of the field $K(x)$. In this case, by Theorem
7.1.5, the polynomial $x^2 + d$ can be represented as the sum of $n$ squares of
elements of $K[x]$, i.e.,

$$x^2 + d = \sum_{i=1}^{n} (a_{i0} + a_{i1} x + a_{i2} x^2 + \cdots)^2.$$

Obviously, in such a representation, we have $a_{i2} = a_{i3} = \dots = 0$. Therefore

$$x^2 + d = \sum_{i=1}^{n} (a_i x + b_i)^2, \quad a_i, b_i \in K.$$

Let us substitute in this identity the value $x = c$ such that $c^2 = (a_n c + b_n)^2$,
i.e., $c = \pm(a_n c + b_n)$ (for $a_n \ne \pm 1$, we may take either sign and, for $a_n = \pm 1$,
only one of the signs will do). As a result, we obtain $d = \sum_{i=1}^{n-1} (a_i c + b_i)^2$, where
$c, a_i, b_i \in K$. In other words, the polynomial $x_1^2 + \dots + x_{n-1}^2 + 1$ admits a
representation as the sum of $n - 1$ squares of rational functions in $x_1, \dots, x_{n-1}$
over $\mathbb{R}$. Repeating similar arguments we will obtain finally that the polynomial
$x_1^2 + 1$ is the square of a rational function in $x_1$ over $\mathbb{R}$. A contradiction.


7.1.3 The inequality between the arithmetic and geometric means 


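The inequality between the arithmetic and geometric means consists in the
following. If $x_1, \dots, x_n$ are non-negative numbers, then

$$\frac{x_1 + x_2 + \dots + x_n}{n} \ge \sqrt[n]{x_1 x_2 \cdots x_n}.$$

Let us replace $x_i$ by $t_i^{2n}$. Then this inequality becomes

$$P(t_1, \dots, t_n) = \frac{t_1^{2n} + t_2^{2n} + \dots + t_n^{2n}}{n} - t_1^2 t_2^2 \cdots t_n^2 \ge 0.$$

Theorem 7.1.7 (Hurwitz [Hu]). The polynomial $P(t_1, \dots, t_n)$ can be
represented as the sum of squares of polynomials.

Proof. Let $y_i = t_i^2$. For any function $f(y_1, \dots, y_n)$, we define

$$Sf(y_1, \dots, y_n) = \sum_{\sigma \in S_n} f(y_{\sigma(1)}, \dots, y_{\sigma(n)}).$$

For example,

$$S y_1^n = (n - 1)!\,(y_1^n + \dots + y_n^n), \qquad S y_1 \cdots y_n = n!\, y_1 \cdots y_n. \tag{1}$$

Consider the functions

$$\varphi_1 = S\bigl((y_1^{n-1} - y_2^{n-1})(y_1 - y_2)\bigr),$$
$$\varphi_2 = S\bigl((y_1^{n-2} - y_2^{n-2})(y_1 - y_2) y_3\bigr),$$
$$\varphi_3 = S\bigl((y_1^{n-3} - y_2^{n-3})(y_1 - y_2) y_3 y_4\bigr),$$
$$\dots$$
$$\varphi_{n-1} = S\bigl((y_1 - y_2)(y_1 - y_2) y_3 y_4 \cdots y_n\bigr).$$

It is easy to verify that

$$\varphi_1 = S y_1^n + S y_2^n - S y_1^{n-1} y_2 - S y_2^{n-1} y_1 = 2 S y_1^n - 2 S y_1^{n-1} y_2.$$

Similarly,

$$\varphi_2 = 2 S y_1^{n-1} y_2 - 2 S y_1^{n-2} y_2 y_3,$$
$$\varphi_3 = 2 S y_1^{n-2} y_2 y_3 - 2 S y_1^{n-3} y_2 y_3 y_4,$$
$$\dots$$
$$\varphi_{n-1} = 2 S y_1^2 y_2 y_3 \cdots y_{n-1} - 2 S y_1 y_2 \cdots y_n.$$

Hence

$$\varphi_1 + \varphi_2 + \dots + \varphi_{n-1} = 2 S y_1^n - 2 S y_1 y_2 \cdots y_n.$$

Taking into account relations (1) we obtain

$$\frac{y_1^n + \dots + y_n^n}{n} - y_1 \cdots y_n = \frac{1}{2\, n!} (\varphi_1 + \varphi_2 + \dots + \varphi_{n-1}),$$

i.e.,

$$\frac{t_1^{2n} + t_2^{2n} + \dots + t_n^{2n}}{n} - t_1^2 t_2^2 \cdots t_n^2 = \frac{1}{2\, n!} (\varphi_1 + \varphi_2 + \dots + \varphi_{n-1}),$$

where

$$\varphi_k = S\bigl((y_1^{n-k} - y_2^{n-k})(y_1 - y_2) y_3 y_4 \cdots y_{k+1}\bigr)$$
$$= S\bigl((y_1 - y_2)^2 (y_1^{n-k-1} + y_1^{n-k-2} y_2 + \dots + y_2^{n-k-1}) y_3 y_4 \cdots y_{k+1}\bigr)$$
$$= S\bigl((t_1^2 - t_2^2)^2 (t_1^{2(n-k-1)} + t_1^{2(n-k-2)} t_2^2 + \dots + t_2^{2(n-k-1)}) t_3^2 t_4^2 \cdots t_{k+1}^2\bigr).$$

Thus $\varphi_k$ is the sum of squares of polynomials in $t_1, \dots, t_n$.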
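
As a sanity check of the identity used in this proof, the sketch below (an illustration added here, not taken from the original; the helper names `S`, `phi_k` and `am_gm_gap` are assumptions) evaluates both sides exactly for sample values $y_i = t_i^2$. It tests the relation $\frac{1}{n}\sum y_i^n - \prod y_i = \frac{1}{2\,n!}\sum_k \varphi_k$ and the non-negativity of each $\varphi_k$ at those points.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod

def S(func, y):
    """Symmetrization: sum of func over all permutations of the tuple y."""
    return sum(func(p) for p in permutations(y))

def phi_k(k, y):
    """phi_k = S((y1^(n-k) - y2^(n-k)) * (y1 - y2) * y3 * ... * y_{k+1})."""
    n = len(y)
    return S(lambda p: (p[0] ** (n - k) - p[1] ** (n - k))
             * (p[0] - p[1]) * prod(p[2:k + 1]), y)

def am_gm_gap(y):
    """(y1^n + ... + yn^n)/n - y1*...*yn, computed exactly."""
    n = len(y)
    return Fraction(1, n) * sum(t ** n for t in y) - prod(y)

def hurwitz_side(y):
    """(1 / (2 n!)) * (phi_1 + ... + phi_{n-1})."""
    n = len(y)
    return Fraction(1, 2 * factorial(n)) * sum(phi_k(k, y) for k in range(1, n))

# Sample points y_i = t_i^2 (arbitrary test values).
for t in [(1, 2), (1, 2, 3), (0, 5, 2, 7), (3, 3, 3)]:
    y = tuple(ti * ti for ti in t)
    print(t, am_gm_gap(y) == hurwitz_side(y),
          all(phi_k(k, y) >= 0 for k in range(1, len(y))))
```

The check is only numerical evidence at a few points; the proof above is what establishes the identity for all $t_1, \dots, t_n$.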
7.1.4 Hilbert’s theorem on non-negative polynomials $p_4(x, y)$


Let $p_k$ be a polynomial of degree $k$. In Section 7.1.1 we gave examples of
non-negative polynomials $p_6(x, y)$ and $p_4(x, y, z)$ that cannot be represented as
sums of squares of polynomials. For polynomials $p_2(x_1, \dots, x_n)$, there are no
such examples. Indeed, to any polynomial $p_2(x_1, \dots, x_n)$, there corresponds
the quadratic form

$$F_2(y_1, \dots, y_{n+1}) = y_{n+1}^2\, p_2\Bigl(\frac{y_1}{y_{n+1}}, \dots, \frac{y_n}{y_{n+1}}\Bigr)$$

and any quadratic form can be represented in the form

$$f_1^2 + \dots + f_k^2 - f_{k+1}^2 - \dots - f_{n+1}^2,$$

where $f_1, \dots, f_{n+1}$ are linear forms. Clearly, the polynomial $p_2$ is non-negative
only if $f_{k+1} = \dots = f_{n+1} = 0$.


It is much more difficult to prove that any non-negative polynomial $p_4(x, y)$
can be represented as the sum of squares of polynomials.


Theorem 7.1.8 (Hilbert). Any non-negative polynomial $p_4(x, y)$ can be
represented as the sum of three squares of polynomials.


We will give two proofs of this theorem. The first proof is simpler but it only
enables us to prove a weaker statement, namely, we will show that $p_4(x, y)$
can be represented as the sum of several (not necessarily three) squares of
polynomials. The second one is Hilbert’s original proof.

It is more convenient to give both proofs not for polynomials but for
homogeneous forms $F(x, y, z)$.


First proof. (Choi-Lam) We will prove that any non-negative homogeneous
form $F_4(x, y, z)$ can be represented as the sum of squares of homogeneous
forms. The first part of the proof concerns forms of any degree in any
number of indeterminates.

To a pair of forms $P$ and $Q$ of degree $n$ in $m$ indeterminates, we can assign
the form $\lambda P + \mu Q$, i.e., the set of all such forms is naturally endowed with
the structure of a linear space. The origin of this linear space is obviously the
zero form.

The non-negative forms constitute a closed convex cone $C$ with the vertex
at the origin $O$. Clearly, if $Q$ is a non-zero form and $Q \in C$, then $-Q \notin C$.
Therefore any plane passing through $O$ and $Q$ intersects $C$ in a (closed) angle
$Q_1 O Q_2$ whose value is strictly less than $\pi$. The form $Q$ is a convex linear
combination of the forms $Q_1$ and $Q_2$, i.e., $Q = \lambda_1 Q_1 + \lambda_2 Q_2$, where $\lambda_1, \lambda_2 \ge 0$
and $\lambda_1 + \lambda_2 = 1$.

Let us draw hyperplanes of support to the cone $C$ passing through the
rays $OQ_1$ and $OQ_2$. They intersect $C$ in certain convex cones of strictly lesser
dimension. Consider the section of each of these cones by a plane passing
through $O$ and $Q$. After several such operations we will necessarily arrive at
the cones of dimension 1 (rays).

A point A of a closed convex cone C is called extremal if there exists a 
hyperplane of support intersecting the cone C only along the ray OA. In other 
words, 


the point A is extremal if it is not the inner point of the segment whose 
end-points belong to C but do not lie on the ray OA. 


The above described construction shows that 


any non-negative homogeneous form Q is a convex linear combination of 
extremal non-negative forms. 


So far we have considered forms of any degree in any number of indeter- 
minates. The following statement only holds for forms of degree 4 in three 
indeterminates. 


Lemma 7.1.9. Any non-negative homogeneous form $T(x, y, z) \ne 0$ of degree
4 can be represented in the form

$$T = q^2 + T_1,$$

where $q \ne 0$ is a quadratic form and $T_1$ is a non-negative form.


Corollary. Any extremal non-negative form $T(x, y, z)$ of degree 4 is a
perfect square.


The corollary obviously follows from the Lemma: for an extremal non-negative
form $T$, the decomposition $T = q^2 + T_1$ must be trivial, i.e., $q^2$ and $T_1$ should
be proportional.

In turn, the corollary obviously implies the theorem. Indeed, the convex 
linear combination of extremal non-negative forms of degree 4 in 3 indetermi- 
nates is a sum of squares of quadratic forms. 


Proof. Let $Z(T)$ be the set of zeros of the form $T$ considered up to
proportionality, i.e., the set of zeros of this form in $\mathbb{RP}^2$.

Case 1: $Z(T) = \emptyset$. On the unit sphere $x^2 + y^2 + z^2 = 1$ the function $T$ attains
a minimal value $\mu > 0$, and so $T(x, y, z) \ge \mu (x^2 + y^2 + z^2)^2$ for all $(x, y, z)$.

Case 2: $Z(T)$ consists of one point; without loss of generality we may
assume that $T(1, 0, 0) = 0$. In this case the coefficient of $x^4$ is equal to 0, and
so

$$T(x, y, z) = x^3(\alpha_1 y + \alpha_2 z) + x^2 f(y, z) + 2x g(y, z) + h(y, z).$$

If $\alpha_1 \ne 0$ or $\alpha_2 \ne 0$, then as $x \to \pm\infty$ we can obtain negative values of $T$.
Hence,

$$T(x, y, z) = x^2 f + 2xg + h.$$

It is also clear that $f \ge 0$ and $h \ge 0$.

In the decomposition

$$fT = (xf + g)^2 + (fh - g^2)$$

the form $fh - g^2$ is non-negative. Indeed, if $fh - g^2 < 0$ at $(a, b)$, then $f(a, b) \ne 0$.
Setting $x = -\frac{g(a, b)}{f(a, b)}$, we see that $fT < 0$ at $(x, a, b)$.


Any non-negative quadratic form $f(y, z)$ can be represented either as the
square of a linear form or as the sum of two squares of linear forms.
Accordingly, consider two possibilities.

(a) $f = f_1^2$, where $f_1 = \alpha y + \beta z$. At the point $(-\beta, \alpha)$ we have
$fh - g^2 = -g^2 \le 0$. Hence $g(-\beta, \alpha) = 0$ since the form $fh - g^2$ is non-negative.
Therefore $g = f_1 g_1$, and hence

$$fT \ge (xf + g)^2 = (x f_1^2 + f_1 g_1)^2 = f_1^2 (x f_1 + g_1)^2 = f (x f_1 + g_1)^2$$

and thus $T \ge (x f_1 + g_1)^2$.
(b) $f = f_1^2 + f_2^2$, where $f_1$ and $f_2$ are linear forms without non-trivial (i.e.,
distinct from the origin) common zeros. Then $f(y, z) > 0$ for $(y, z) \ne (0, 0)$.
Suppose that $fh - g^2 = 0$ for $(y, z) = (a, b) \ne (0, 0)$. Then $T = 0$ for
$(x, y, z) = \bigl(-\frac{g(a, b)}{f(a, b)}, a, b\bigr)$. This is a contradiction since, by the hypothesis,
$T$ has only one zero in $\mathbb{RP}^2$, namely $(1, 0, 0)$.

Thus $fh - g^2 > 0$ for $(y, z) \ne (0, 0)$, and therefore $\frac{fh - g^2}{f^3} \ge \mu > 0$ on
the unit circle. Hence $fh - g^2 \ge \mu f^3$ for all $(y, z)$. Therefore

$$T \ge \frac{fh - g^2}{f} \ge \mu f^2 = (\sqrt{\mu} f)^2.$$
Case 3: $Z(T)$ contains no less than two points; without loss of generality
we may assume that $T(1, 0, 0) = T(0, 1, 0) = 0$. As in Case 2, the form $T$
cannot contain terms with $x^4$ and $x^3$, nor can it contain terms with $y^4$ and
$y^3$. Hence

$$T(x, y, z) = x^2 f(y, z) + 2xz\, g(y, z) + z^2 h(y, z).$$

In the decomposition

$$fT = (xf + zg)^2 + z^2 (fh - g^2)$$

the form $fh - g^2$ is non-negative.

While considering subcase (a) of Case 2 we did not use the fact that the
form $T$ has precisely one zero. Therefore, if $f = f_1^2$ (or $h = h_1^2$), we may apply
the same arguments. It remains to consider the case when $f > 0$ and $h > 0$.
Again, consider two possibilities.

(a) $fh - g^2$ has a non-trivial zero $(a, b)$. Let $\alpha = -\frac{g(a, b)}{f(a, b)}$ and define

$$T_1(x, y, z) = T(x + \alpha z, y, z) = x^2 f + 2xz(g + \alpha f) + z^2(h + 2\alpha g + \alpha^2 f).$$

At $(a, b)$, we have

$$h + 2\alpha g + \alpha^2 f = h - 2\frac{g}{f}\, g + \frac{g^2}{f^2}\, f = h - \frac{g^2}{f} = \frac{hf - g^2}{f} = 0.$$

Therefore $h + 2\alpha g + \alpha^2 f = h_1^2$. Thus $T_1(x, y, z) \ge (z h_1)^2$, and therefore

$$T(x, y, z) = T_1(x - \alpha z, y, z) \ge (z h_1(x - \alpha z, y, z))^2.$$


(b) $fh - g^2 > 0$. Then

$$\frac{fh - g^2}{(y^2 + z^2) f} \ge \mu > 0$$

and therefore

$$fT = (xf + zg)^2 + z^2(fh - g^2) \ge z^2(fh - g^2) \ge \mu z^2 (y^2 + z^2) f.$$

As a result, we obtain $T \ge (\sqrt{\mu}\, zy)^2 + (\sqrt{\mu}\, z^2)^2 \ge (\sqrt{\mu}\, z^2)^2$.


Second proof. (Hilbert) The main idea of this proof is to consider the set
$A$ consisting of the real forms in three indeterminates that can be represented
in the form $f^2 + g^2 + h^2$, where $f, g, h$ are real quadratic forms without
non-trivial common zeros over the field of complex numbers.


Lemma 7.1.10. The set A is open. 


Proof. The coefficients $a_{ijk}$ of the form $\sum a_{ijk} x^i y^j z^k$ can be considered
as coordinates in $\mathbb{R}^N$. Therefore the map

$$\Phi: (f, g, h) \mapsto F = f^2 + g^2 + h^2$$

is an algebraic map $\mathbb{R}^{18} \to \mathbb{R}^{15}$ (a quadratic form in 3 indeterminates is
determined by 6 coefficients and a form of degree 4 is determined by 15
coefficients).

It suffices to prove that if $f, g, h$ have no non-trivial common zeros over $\mathbb{C}$
(so that $\Phi(f, g, h) \in A$), then the rank of the differential $d\Phi$ of the map $\Phi$ at
the point $(f, g, h)$ is equal to 15, i.e., $\dim \ker d\Phi = 3$. Clearly,

$$d\Phi(u, v, w) = 2(uf + vg + wh),$$

where $(u, v, w) \in \mathbb{R}^{18}$ is a triple of quadratic forms and $uf + vg + wh$ is a form
of degree 4.

The quadratic forms $f, g, h$ have no non-trivial common zeros over $\mathbb{C}$. Let
us prove that the equation

$$uf + vg + wh = 0 \tag{1}$$

($u, v, w$ are quadratic forms) then implies that

$$u = \nu g - \mu h, \quad v = \lambda h - \nu f, \quad w = \mu f - \lambda g$$

for some $\lambda, \mu, \nu \in \mathbb{C}$.

It suffices to prove that

$$u = \nu_1 g - \mu_1 h, \quad v = \lambda_1 h - \nu_2 f, \quad w = \mu_2 f - \lambda_2 g$$

because (1) then gives

$$(\lambda_1 - \lambda_2) hg + (\mu_2 - \mu_1) fh + (\nu_1 - \nu_2) fg = 0.$$

The curves $f = 0$, $g = 0$ and $h = 0$ are distinct. So, on the curve $f = 0$, there
is a point which does not belong to either $g = 0$ or $h = 0$. Having considered
the values of $f$, $g$ and $h$ at this point, we obtain $\lambda_1 = \lambda_2$. We similarly prove
that $\mu_1 = \mu_2$ and $\nu_1 = \nu_2$.

Let us now prove, for example, that $w = \mu_2 f - \lambda_2 g$. By Hilbert’s
Nullstellensatz the ideal generated by $f$, $g$ and $h$ contains a power of any
polynomial vanishing at the origin, since these forms have no non-trivial common
zeros. In particular, $x^n$ can be represented for some $n$ in the form

$$x^n = rf + sg + th, \tag{2}$$

where $r, s, t$ are forms of degree $n - 2$. Consider equation (2) with the minimal
$n$. From (1) and (2) we deduce that

$$uft + vgt + wht = 0, \qquad x^n w = rfw + sgw + thw$$

and therefore

$$x^n w = (rw - ut) f + (sw - vt) g = af + bg, \tag{3}$$

where $a$ and $b$ are forms of degree $n$.

If $n = 0$, we get the equality desired.

If $n > 0$, we obtain a contradiction to the minimality of $n$. Indeed, for
$x = 0$, the equality (3) becomes

$$a_0 f_0 + b_0 g_0 = 0,$$

where $a_0 = a(0, y, z)$, etc. Since $f_0$ and $g_0$ have no common zeros, $a_0 = d_0 g_0$
and $b_0 = -d_0 f_0$ for a polynomial $d_0(y, z)$. Set $d(x, y, z) = d_0(y, z)$ and consider
the polynomials $a_1 = a - dg$ and $b_1 = b + df$. Clearly,

$$a_1 f + b_1 g = af + bg = x^n w$$

and $a_1(0, y, z) = b_1(0, y, z) = 0$, i.e., $a_1$ and $b_1$ are divisible by $x$. Dividing by
$x$ we obtain an equality of the form $a_2 f + b_2 g = x^{n-1} w$ which contradicts the
fact that $n$ is minimal.

Thus, the kernel of the map

$$d\Phi: (u, v, w) \mapsto 2(uf + vg + wh)$$

consists of the vectors of the form

$$(\nu g - \mu h,\ \lambda h - \nu f,\ \mu f - \lambda g) = \lambda(0, h, -g) + \mu(-h, 0, f) + \nu(g, -f, 0),$$

and therefore the dimension of the kernel is 3.
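
The dimension count can be illustrated on a concrete triple. The sketch below is an added illustration, not the author's computation: it assumes the sample forms $f = x^2$, $g = y^2$, $h = z^2$ (which have no non-trivial common zero over $\mathbb{C}$), builds the matrix of the linear map $(u, v, w) \mapsto uf + vg + wh$ in monomial bases, and computes its rank with exact rational elimination. The helper names are hypothetical.

```python
from fractions import Fraction

def monomials(deg):
    """Exponent triples (i, j, k) with i + j + k = deg."""
    return [(i, j, deg - i - j)
            for i in range(deg + 1) for j in range(deg + 1 - i)]

def times_monomial(m, form):
    """Multiply a form (dict: exponent triple -> coefficient) by the monomial m."""
    return {tuple(a + b for a, b in zip(m, e)): c for e, c in form.items()}

def rank(rows):
    """Rank of a rational matrix by Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def differential_rank(f, g, h):
    """Rank of (u, v, w) -> u*f + v*g + w*h on triples of quadratic forms."""
    quartics = monomials(4)
    rows = []
    for form in (f, g, h):
        for m in monomials(2):          # basis vector: monomial m in one slot
            image = times_monomial(m, form)
            rows.append([image.get(q, 0) for q in quartics])
    return rank(rows)

x2, y2, z2 = {(2, 0, 0): 1}, {(0, 2, 0): 1}, {(0, 0, 2): 1}
print(len(monomials(2)), len(monomials(4)))   # coefficient counts: 6 and 15
print(differential_rank(x2, y2, z2))          # rank of the 18 x 15 matrix
```

With 18 source coordinates, rank 15 corresponds to a 3-dimensional kernel. For a triple with a common zero, such as $x^2, xy, xz$, the same function returns a smaller rank.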


Lemma 7.1.11. Let $F \in \bar A \setminus A$, where $\bar A$ is the closure of $A$. Then either $F$
has a nontrivial real zero or, over $\mathbb{C}$, the curve $F = 0$ has at least two double
points.


Proof. Clearly, $F = f^2 + g^2 + h^2$, where $f$, $g$ and $h$ have a common
non-trivial zero $(a, b, c)$. If this zero is not real, then the points $(a, b, c)$ and
$(\bar a, \bar b, \bar c)$ are two distinct double points on the curve $F = 0$. Indeed, these
points are zeros of the functions $f, g, h$, and so they are zeros of multiplicity 2 of
the functions $f^2, g^2, h^2$. Hence, they are zeros of multiplicity 2 of the function
$F = f^2 + g^2 + h^2$.


Let us now move on to the proof of the theorem proper. We have to prove
that any non-negative form lies in $\bar A$. The open set $A$ is bounded by the
surface $\partial A = \bar A \setminus A$. Let $F_1$ be an arbitrary non-negative form of degree 4 in
3 indeterminates. If $F_1 \in A$ we have nothing more to prove. So let us assume
that $F_1 \notin A$. Then the segment $F_0 F_1$, where $F_0 \in A$ is an arbitrary point,
should intersect $\partial A$ at a point $F_t$. It suffices to prove that we can select $F_0$
so that $F_t$ coincides with $F_1$ (then $F_1 = F_t \in \partial A \subset \bar A$). Assume that $F_t$
is an interior point of the segment $F_0 F_1$. We can select $F_0$ so that $F_t$ has a
non-trivial real zero. Indeed, by Lemma 7.1.11 the forms that belong to $\partial A$
and do not have non-trivial real zeros correspond to curves with two double
points, and such forms constitute a set of codimension no less than 2. Indeed,
the curve $F = 0$ has a double point if the system of equations

$$F = 0, \quad F_x = 0, \quad F_y = 0, \quad F_z = 0$$

has a solution. The first equation can be disregarded since

$$x F_x + y F_y + z F_z = nF,$$

where $n$ is the degree of $F$ (in our case $n = 4$). The curves $F_x = 0$ and $F_y = 0$
intersect at $(n - 1)^2$ points. The curve $F = 0$ has $k$ double points if the curve
$F_z = 0$ passes through $k$ of the intersection points of the curves $F_x = 0$ and
$F_y = 0$. This imposes $k$ algebraic relations on the coefficients of $F$.

Thus, the form $F_t = (1 - t)F_0 + tF_1$ has a non-trivial real zero. But this
contradicts the fact that $F_t \ne F_1$. Indeed, $F_0 > 0$ and $F_1 \ge 0$, and so, for
$t \ne 1$, the form $(1 - t)F_0 + tF_1$ takes strictly positive values at all points
distinct from the origin.

7.2 Artin’s theory


This section is mainly devoted to Artin’s solution of Hilbert’s seventeenth
problem on the representability of a non-negative polynomial as the sum of
squares of rational functions. Artin’s proof gives no estimate of the number
of rational functions needed in the representation of a polynomial in
$n$ indeterminates. Such an estimate is due to Pfister: a non-negative polynomial
in $n$ indeterminates can be represented as the sum of $2^n$ squares of rational
functions. Pfister’s theory will be discussed in Section 7.3.

In Sections 7.2.1 and 7.2.2 we give necessary preliminaries from the theory
of real fields; the results of these sections are due to Artin and Schreier [Ar2].
The solution proper of Hilbert’s seventeenth problem is contained in 7.2.3.
This proof is based on a theorem of Sylvester which enables us to calculate
the number of real roots of a polynomial in terms of the index of a quadratic
form (see page 33). Our exposition follows [Sc1].


7.2.1 Real fields 
A field $K$ is said to be ordered if it is split into three non-intersecting subsets

$$K = N \cup \{0\} \cup P$$

such that $N = -P$ (the subset $N$ of negative numbers and the subset $P$ of
positive numbers), where the sum and the product of two positive numbers
are positive.

For any ordered field, we write $x > y$ if $x - y \in P$ (we also write
$x \ge y$ if either $x - y \in P$ or $x = y$).

Set

$$|a| = \begin{cases} a & \text{for } a > 0, \\ 0 & \text{for } a = 0, \\ -a & \text{for } a < 0. \end{cases}$$

It is easy to verify that $|ab| = |a| \cdot |b|$ and $|a + b| \le |a| + |b|$.

In any ordered field we have $1 > 0$ since the opposite inequality $-1 > 0$
leads to a contradiction: it gives $1 = (-1)(-1) > 0$, whereas $a > 0$ implies
$-a < 0$. In particular, the characteristic of any ordered field is equal
to zero since $1 + \dots + 1 > 0$ for any nonzero number of summands. In any
ordered field $x^2 > 0$ for $x \ne 0$. Indeed, both inequalities $x > 0$ and $-x > 0$ imply
the same inequality $x^2 > 0$.

There is only one ordering of the field $\mathbb{Q}$, namely, $\frac{p}{q} > 0$ if and only
if $pq > 0$. Indeed, the numbers $\frac{p}{q}$ and $pq$ are obtained from each other by
multiplication by $q^{\pm 2} > 0$.

Any field $L$ in which $-1$ is the sum of squares (and the characteristic of
$L$ is distinct from 2) is an example of a field that cannot be ordered. Indeed,
any element $a$ of such a field $L$ is the sum of squares:

$$a = \Bigl(\frac{a + 1}{2}\Bigr)^2 + (-1)\Bigl(\frac{a - 1}{2}\Bigr)^2.$$

The field $K$ is called formally real if $-1$ cannot be represented as the sum
of squares of elements from $K$. An equivalent condition: if $b_1^2 + \dots + b_n^2 = 0$,
where $b_1, \dots, b_n \in K$, then $b_1 = \dots = b_n = 0$. For brevity, we will call formally
real fields just real ones.


The characteristic of any real field is equal to 0. Indeed, if the characteristic
is equal to $p$, then $-1 = \underbrace{1^2 + \dots + 1^2}_{p-1}$.
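
Both observations are easy to test in the prime field $\mathbb{Z}/p\mathbb{Z}$. The sketch below is an added illustration with hypothetical helper names. It assumes $p$ is an odd prime, finds $x, y$ with $x^2 + y^2 = -1$ by brute force (such a pair is known to exist for every odd prime), and then uses the identity $a = \bigl(\frac{a+1}{2}\bigr)^2 + (-1)\bigl(\frac{a-1}{2}\bigr)^2$ to write an arbitrary residue $a$ as a sum of three squares.

```python
def minus_one_as_two_squares(p):
    """Find (x, y) with x^2 + y^2 = -1 in Z/pZ, p an odd prime (brute force)."""
    for x in range(p):
        for y in range(p):
            if (x * x + y * y + 1) % p == 0:
                return x, y
    raise ValueError("no representation found")

def as_sum_of_squares(a, p):
    """Write a mod p as u^2 + (x*v)^2 + (y*v)^2, using
    a = ((a+1)/2)^2 + (-1) * ((a-1)/2)^2 and -1 = x^2 + y^2."""
    x, y = minus_one_as_two_squares(p)
    half = pow(2, -1, p)                 # inverse of 2 modulo p (Python 3.8+)
    u = (a + 1) * half % p
    v = (a - 1) * half % p
    return u, x * v % p, y * v % p

p = 7
print(minus_one_as_two_squares(p))
for a in range(p):
    squares = as_sum_of_squares(a, p)
    print(a, squares, sum(s * s for s in squares) % p == a)

# -1 is also the sum of p - 1 copies of 1^2 modulo p.
print((p - 1) * 1 ** 2 % p == (-1) % p)
```

Since every element is a sum of squares here, no ordering can exist, in line with the argument in the text.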
Theorem 7.2.1. Let $K$ be a real field, and let $a \in K$.
a) If $a$ is the sum of squares of elements from $K$, then $K(\sqrt{a})$ is real.
b) If $K(\sqrt{a})$ is not real, then $-a$ is the sum of squares of some elements
from $K$.


Proof. To prove a), we suppose on the contrary that $K(\sqrt{a})$ is not real.
Then, in particular, $K(\sqrt{a}) \ne K$, i.e., $\sqrt{a} \notin K$. Moreover, $-1$ is the sum of
squares of some elements from $K(\sqrt{a})$, i.e., there exist elements $b_i, c_i \in K$
such that

$$-1 = \sum (b_i + c_i \sqrt{a})^2 = \sum b_i^2 + 2\sqrt{a} \sum b_i c_i + a \sum c_i^2. \tag{1}$$

Formula (1) shows that, if $\sum b_i c_i \ne 0$, then $\sqrt{a} \in K$. Therefore

$$-1 = \sum b_i^2 + a \sum c_i^2. \tag{2}$$

Let $a$ be the sum of squares of some elements from $K$. Formula (2) shows
that in this case our supposition that $K(\sqrt{a})$ is not real leads to a
contradiction. This proves statement a).

To prove b), we write (2) in the form

$$-a = \frac{1 + \sum b_i^2}{\sum c_i^2}.$$

It remains to observe that if $p$ and $q$ are sums of squares, then so is $\frac{p}{q} = \frac{pq}{q^2}$.


Corollary. For a real field $K$, one of the fields $K(\sqrt{a})$ and $K(\sqrt{-a})$ is
necessarily real.


Proof. If $K(\sqrt{a})$ is not real, then $-a$ is the sum of squares, and so $K(\sqrt{-a})$
is a real field.


Theorem 7.2.2. Let $K$ be a real field and let $f \in K[x]$ be an irreducible
polynomial of odd degree. Then the field $K(\alpha)$, where $\alpha$ is a root of $f$, is real.

Proof. Let $n = \deg f$. Suppose that the field $K(\alpha)$ is not real. Then

$$-1 = \sum g_i(\alpha)^2,$$

where $g_i$ are polynomials of degree no higher than $n - 1$. Since $\alpha$ is a root of
$1 + \sum g_i(x)^2$, the latter is divisible by $f$, i.e.,

$$-1 = \sum g_i(x)^2 + h(x) f(x),$$

where $h$ is a polynomial. Suppose if possible that $h = 0$. Then $-1 = \sum g_i(x)^2$.
If $\max(\deg g_i) = m > 0$, then the sum of squares of the coefficients of $x^m$ in
the polynomials $g_i$ is equal to zero. If $g_i = c_i \in K$, then $-1 = \sum c_i^2$. Both
versions contradict the fact that $K$ is real, and so $h \ne 0$.

The degree of the polynomial $\sum g_i(x)^2$ is even and does not exceed $2n - 2$.
Therefore the degree of $h$ is odd and does not exceed $n - 2$. The polynomial
$h$ has an irreducible factor $h_1$ whose degree is odd and does not exceed $n - 2$.
Let $\beta$ be a root of $h_1$. Then $-1 = \sum g_i(\beta)^2$, i.e., $-1$ is the sum of squares of
elements from $K(\beta)$. Repeating for $h_1$ the same arguments as for $f$, we see
that $-1$ is the sum of squares of elements from the field $K(\gamma)$, where $\gamma$ is a
root of an irreducible polynomial of odd degree that does not exceed $n - 4$,
etc. Thus we have a contradiction.


The field $K$ is called real closed if it is real and any real algebraic extension
of it coincides with $K$.

Theorem 7.2.2 implies that in any real closed field any polynomial of odd
degree has a root.

A real closure of a real field $K$ is any real closed field $R$ algebraic over $K$.

Any real field $K$ has a real closure. Indeed, consider the partially ordered
set of all real fields algebraic over $K$. By Zorn’s lemma this set has at least
one maximal element, $R$. Clearly, $R$ is a real closed field.


Theorem 7.2.3. The real closed field R has precisely one ordering, namely, 
the nonzero element a € R is positive if and only if it is a square. 


Proof. Clearly, the equalities $a = t_1^2$ and $-a = t_2^2$, where $t_1, t_2 \in R$, cannot
be satisfied simultaneously since otherwise $-1 = \bigl(\frac{t_1}{t_2}\bigr)^2$. Therefore it suffices
to prove that $\pm a = t^2$, where $t \in R$.

If $a \ne t^2$, then the field $R(\sqrt{a})$ is a proper extension of $R$, so it is not real.
In this case, by Theorem 7.2.1 (b), the element $-a$ is the sum of squares of
elements from $R$. Then by Theorem 7.2.1 (a) the field $R(\sqrt{-a})$ is real, and so
it coincides with $R$. This means that $\sqrt{-a} \in R$, i.e., $-a = t^2$, where $t \in R$.


Theorem 7.2.4. Let $K$ be a real field and suppose that the element $a \in K$
cannot be represented as the sum of squares. Then there exists an ordering of
$K$ for which $a < 0$.

Proof. By Theorem 7.2.1 (b) the field $K(\sqrt{-a})$ is real. Let $R$ be a real
closure of $K(\sqrt{-a})$. By Theorem 7.2.3 the field $R$ has an ordering under which
the element $-a = (\sqrt{-a})^2$ is positive, i.e., $a$ is negative. The restriction of this
ordering onto $K \subset R$ is the ordering desired.


Denote the algebraic closure of $R$ by $\bar R$.


Theorem 7.2.5. The field $R$ is real closed if and only if $\bar R \ne R$ and
$\bar R = R(\sqrt{-1})$.


Proof. First, suppose that $R$ is real closed. Then the equation $x^2 + 1 = 0$
has no solutions in $R$. So $\bar R \ne R$. Let us prove that $\bar R = R(\sqrt{-1})$.

For brevity, set $i = \sqrt{-1}$. First we show that any second order equation
with coefficients in $R(i)$ has a root in $R(i)$. The roots of the quadratic
$x^2 + 2px + q$ are given by the formula $x_{1,2} = -p \pm \sqrt{p^2 - q}$, and so it suffices
to prove that, if $a, b \in R$, then $\sqrt{a + ib} \in R(i)$. In other words, we have to
select $c, d \in R$ so that $(c + di)^2 = a + bi$, i.e., $c^2 - d^2 = a$ and $2cd = b$. Clearly,
$a^2 + b^2 = (c^2 + d^2)^2$. Therefore

$$c^2 = \frac{\sqrt{a^2 + b^2} + a}{2}, \qquad d^2 = \frac{\sqrt{a^2 + b^2} - a}{2}$$

looks like an appropriate choice. Let us verify first of all that numbers $c$ and $d$
thus defined actually belong to $R$. Since $a^2 \ge 0$ and $b^2 \ge 0$, we have $a^2 + b^2 \ge 0$,
so $\sqrt{a^2 + b^2} \in R$ by Theorem 7.2.3. Further, $\sqrt{a^2 + b^2} \ge \sqrt{a^2} \ge \pm a$ (here
$\sqrt{a^2 + b^2}$ and $\sqrt{a^2}$ are non-negative numbers). Therefore $\sqrt{a^2 + b^2} \pm a \ge 0$,
i.e., $c$ and $d$ belong to $R$. The equality $c^2 - d^2 = a$ is automatically satisfied
and the equality $2cd = b$ will be satisfied if we correctly select the signs of $c$
and $d$.
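
Over the ordinary real numbers this recipe can be tried out directly. The sketch below is an added illustration: it assumes floating-point reals in place of a general real closed field $R$, and the function name `complex_sqrt` is hypothetical. It computes $c$ and $d$ from the two formulas using only real square roots and fixes the sign of $d$ so that $2cd = b$.

```python
import math

def complex_sqrt(a, b):
    """Return (c, d) with (c + d*i)^2 = a + b*i, using only real square roots."""
    r = math.sqrt(a * a + b * b)               # sqrt(a^2 + b^2) >= |a|
    c = math.sqrt(max(r + a, 0.0) / 2)         # max() guards against rounding
    d = math.sqrt(max(r - a, 0.0) / 2)
    if b < 0:                                  # choose signs so that 2*c*d = b
        d = -d
    return c, d

for a, b in [(3.0, 4.0), (-1.0, 0.0), (0.0, -2.0), (5.0, -12.0)]:
    c, d = complex_sqrt(a, b)
    # Compare (c + d*i)^2 = (c^2 - d^2) + (2*c*d)*i with a + b*i.
    print((a, b), (c, d),
          math.isclose(c * c - d * d, a, abs_tol=1e-12),
          math.isclose(2 * c * d, b, abs_tol=1e-12))
```

The only inputs to `math.sqrt` are non-negative reals, which mirrors the point of the argument: square roots of non-negative elements of $R$ suffice to extract square roots in $R(i)$.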

Let us now consider an arbitrary polynomial $f$ irreducible over $R$. Let
us express its degree $n$ in the form $n = 2^m q$, where $q$ is odd. We prove by
induction on $m$ that $f$ has a root in $R(i)$. For $m = 0$ this follows from Theorem
7.2.2. Now suppose that $m > 0$ and the statement required is already proved
for all smaller values of $m$.

Let $\alpha_1, \dots, \alpha_n$ be all the roots of $f$; they belong to some extension of $R$.
Select $c \in R$ so that all the numbers $\alpha_k \alpha_l + c(\alpha_k + \alpha_l)$, where $k \ne l$, are distinct.
These numbers are the roots of a polynomial $g$ of degree $\frac{n(n-1)}{2}$ with
coefficients from $R$ (the coefficients belong to $R$ since they are symmetric functions
in $\alpha_1, \dots, \alpha_n$). The number $\deg g = \frac{n(n-1)}{2}$ is of the form $2^{m-1} q(n - 1)$,
where $q(n - 1)$ is odd. We can therefore apply to $g$ the induction hypothesis.
Without loss of generality we may assume that the root $\alpha_1 \alpha_2 + c(\alpha_1 + \alpha_2)$ of
$g$ belongs to $R(i)$.

Let us prove that $R(\alpha_1 \alpha_2, \alpha_1 + \alpha_2) = R(\alpha_1 \alpha_2 + c(\alpha_1 + \alpha_2))$. To this end,
consider the polynomial $F$ with the roots $\alpha_k \alpha_l$ and the polynomial $G$ with
the roots $\alpha_k + \alpha_l$. The coefficients of $F$ and $G$ belong to $R$. Let

$$\theta = \alpha_1 \alpha_2 + c(\alpha_1 + \alpha_2).$$

Clearly, $G(\alpha_1 + \alpha_2) = 0$ and

$$F(\theta - c(\alpha_1 + \alpha_2)) = F(\alpha_1 \alpha_2) = 0,$$

i.e., $\alpha_1 + \alpha_2$ is a common root of the polynomials $G(x)$ and $F(\theta - cx)$ with
coefficients in $R(\theta)$. The condition $\theta - c(\alpha_k + \alpha_l) \ne \alpha_k \alpha_l$ for $\{k, l\} \ne \{1, 2\}$ implies
that $\alpha_1 + \alpha_2$ is the only common root of these polynomials. Moreover, this
common root is a simple one since it is a simple root of $G$. Hence, the greatest
common divisor of $G(x)$ and $F(\theta - cx)$ is of the form $x - (\alpha_1 + \alpha_2)$. The
coefficients of these polynomials lie in $R(\theta)$, so $\alpha_1 + \alpha_2 \in R(\theta)$ and

$$\alpha_1 \alpha_2 = \theta - c(\alpha_1 + \alpha_2) \in R(\theta) = R(\alpha_1 \alpha_2 + c(\alpha_1 + \alpha_2)).$$

So we have proved that $R(\alpha_1 \alpha_2, \alpha_1 + \alpha_2) \subset R(\alpha_1 \alpha_2 + c(\alpha_1 + \alpha_2))$. The opposite
inclusion $R(\alpha_1 \alpha_2 + c(\alpha_1 + \alpha_2)) \subset R(\alpha_1 \alpha_2, \alpha_1 + \alpha_2)$ is obvious.

Thus, $\alpha_1 \alpha_2, \alpha_1 + \alpha_2 \in R(\alpha_1 \alpha_2 + c(\alpha_1 + \alpha_2))$ and $R(\alpha_1 \alpha_2 + c(\alpha_1 + \alpha_2)) \subset R(i)$.
Therefore $\alpha_1$ and $\alpha_2$ are the roots of a second order polynomial with
coefficients in $R(i)$. But we have already proved that the roots of any second
order polynomial with coefficients from $R(i)$ belong to $R(i)$. Therefore $f$ has
a root $\alpha_1$ which belongs to $R(i)$. This means that $\bar R = R(i)$.

The converse statement (if $\bar R \ne R$ and $\bar R = R(i)$, then $R$ is real closed)
is much easier to prove. There are no intermediate fields between $R$ and
$\bar R = R(i)$, and so it suffices to prove that $R$ is real, i.e., $-1$ is not the sum of squares
of elements of $R$. By the hypothesis $i \notin R$, i.e., $-1$ is not a square. Thus, it
suffices to prove that in $R$ the sum of squares is a square itself. Since $R(i)$
is algebraically closed, it follows that, if $a, b \in R$, then $\sqrt{a + bi} \in R(i)$, i.e.,
there are elements $c$ and $d$ in $R$ such that $a + bi = (c + di)^2$. In this case, we
also have $a - bi = (c - di)^2$. Therefore

$$a^2 + b^2 = (a + bi)(a - bi) = (c + di)^2 (c - di)^2 = (c^2 + d^2)^2.$$

It remains to observe that, if the sum of two squares is a square itself, then
the sum of any number of squares is also a square.


7.2.2 Sylvester’s theorem for real closed fields 


Let $K$ be an ordered field, and $f$ an irreducible polynomial over $K$. We call a
real closed field $R \supset K$ a real closure of $K$ if $R$ is algebraic over $K$ and the
ordering of $K$ induced by the ordering of $R$ coincides with the initial ordering
of $K$.

The number of distinct real roots of a real polynomial can be computed
without going outside $\mathbb{R}$. This can be done by two methods: with the help of
Sturm’s theorem or with the help of Sylvester’s theorem. Both these theorems
can be proved for any ordered field. The most important corollary of this
situation is as follows:

if $f$ is a polynomial over an ordered field $K$, then in any real closure $R$ of
$K$ the number of the roots of $f$ is the same.


Initially, Artin’s theory was based on Sturm’s theorem; for a modernized 
construction of Artin’s theory based on Sturm’s theorem, see the book [La2]. 
Sylvester’s theorem, however, is in many ways more convenient. Following 
[Sc1] we give the solution of Hilbert’s seventeenth problem based on Sylvester’s 
theorem. 

Over an ordered field $K$, the signature of a quadratic form $\varphi$ is defined as
follows. Over $K$, a quadratic form $\varphi$ can be reduced to the form

$$\varphi(x) = \lambda_1 x_1^2 + \dots + \lambda_n x_n^2. \tag{1}$$

As for $\mathbb{R}$, the number of positive coefficients $\lambda_i$ and the number of the negative
ones do not depend on the way we reduce the form $\varphi$ to the canonical form
(1); the signature of $\varphi$ is the difference between these two numbers. Indeed,
suppose that there exist two decompositions

$$V_+ \oplus V_- \oplus V_0 = W_+ \oplus W_- \oplus W_0.$$

Then $V_+ \cap (W_- \oplus W_0) = 0$, so $\dim V_+ + \dim W_- + \dim W_0 \le n$. Hence
$\dim V_+ \le \dim W_+$. Similarly, $\dim W_+ \le \dim V_+$.

Let $R$ be a real closure of $K$. Theorem 7.2.5 implies that the degree of
any polynomial irreducible over $R$ is equal to 1 or 2. Therefore an obvious
modification of the arguments used in the proof of Theorem 1.4.5 (see page 33)
enables one to prove the following statement.


Theorem 7.2.6. Let $f$ be a polynomial over $K$ and $\varphi(x, y)$ a bilinear
symmetric form on the space $K[x]/(f)$ equal to the trace of the operator of
multiplication by $xy$. Then the signature of $\varphi$ is equal to the number of distinct
roots of $f$ which lie in the real closure $R$ of $K$.


In particular, as we have already mentioned, the number of the distinct
roots of $f$ is the same for all real closures of $K$.
The form $\varphi$ will be called the trace form of the space $K[x]/(f)$.
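
For $K = \mathbb{Q}$ the statement of Theorem 7.2.6 is easy to experiment with. The sketch below is an added illustration under stated assumptions: $f$ is monic with rational coefficients, the basis of $K[x]/(f)$ is $1, x, \dots, x^{n-1}$, and the trace of multiplication by $x^k$ is computed as the power sum $s_k$ of the roots via Newton's identities, so the Gram matrix of the trace form is the Hankel matrix $(s_{i+j})$. The function names are hypothetical.

```python
from fractions import Fraction

def power_sums(coeffs, count):
    """coeffs = [c_1, ..., c_n] for the monic f = x^n + c_1 x^(n-1) + ... + c_n.
    Return [s_0, ..., s_{count-1}], the power sums of the roots (Newton's identities)."""
    n = len(coeffs)
    c = [Fraction(x) for x in coeffs]
    s = [Fraction(n)]
    for k in range(1, count):
        total = sum(c[j - 1] * s[k - j] for j in range(1, min(k - 1, n) + 1))
        if k <= n:
            total += k * c[k - 1]
        s.append(-total)
    return s

def signature(matrix):
    """Signature (positive minus negative squares) of a symmetric rational
    matrix, by simultaneous row and column elimination."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    positive = negative = 0
    for i in range(n):
        if a[i][i] == 0:
            swap = next((j for j in range(i + 1, n) if a[j][j] != 0), None)
            if swap is not None:            # bring a nonzero diagonal entry up
                a[i], a[swap] = a[swap], a[i]
                for row in a:
                    row[i], row[swap] = row[swap], row[i]
            else:
                mix = next((j for j in range(i + 1, n) if a[i][j] != 0), None)
                if mix is None:             # zero row: contributes nothing
                    continue
                a[i] = [x + y for x, y in zip(a[i], a[mix])]
                for row in a:               # matching column operation
                    row[i] += row[mix]
        pivot = a[i][i]
        if pivot > 0:
            positive += 1
        else:
            negative += 1
        for r in range(i + 1, n):           # Schur complement on the trailing block
            factor = a[r][i] / pivot
            for col in range(i, n):
                a[r][col] -= factor * a[i][col]
    return positive - negative

def count_distinct_real_roots(coeffs):
    """Signature of the trace form of Q[x]/(f) for monic f."""
    n = len(coeffs)
    s = power_sums(coeffs, 2 * n - 1)
    hankel = [[s[i + j] for j in range(n)] for i in range(n)]
    return signature(hankel)

print(count_distinct_real_roots([0, -1, 0]))   # f = x^3 - x
print(count_distinct_real_roots([0, 1]))       # f = x^2 + 1
print(count_distinct_real_roots([0, 0, -1]))   # f = x^3 - 1
```

No root is ever approximated: the count is obtained from the coefficients alone, which is what makes the theorem usable over an arbitrary ordered field.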


Theorem 7.2.7 (Artin-Schreier). Let $K$ be an ordered field, and let $R$
and $R'$ be real closures of $K$. Then there exists precisely one isomorphism
$\sigma: R \to R'$ over $K$ and this isomorphism preserves the ordering.


Proof. In a real closed field, the condition $x \ge y$ is equivalent to the
fact that $x - y$ is a square. Therefore any isomorphism $\sigma: R \to R'$ preserves
the ordering.

The field $R$ is algebraic over $K$, and so any element $\alpha \in R$ is a root of
an irreducible polynomial $f$ over $K$. Theorem 7.2.6 implies that $f$ has the
same number of roots in $R$ and in $R'$. Let these roots be $\alpha_1 < \dots < \alpha_n$ and
$\alpha'_1 < \dots < \alpha'_n$, respectively. In $R$, select elements $t_i$ so that $t_i^2 = \alpha_{i+1} - \alpha_i$.

By the theorem on a primitive element,

$$K(\alpha_1, \dots, \alpha_n, t_1, \dots, t_{n-1}) = K(\theta),$$

where $\theta \in R$ is a root of a polynomial $g$ irreducible over $K$. In $R'$, the
polynomial $g$ has the same number of roots as in $R$. In particular, it has
a root $\theta' \in R'$. Over $K$, there exists an isomorphism $K(\theta) \to K(\theta')$. This
isomorphism is an embedding

$$\sigma: K(\alpha_1, \dots, \alpha_n, t_1, \dots, t_{n-1}) \to R'.$$

It is easy to verify that $\sigma(\alpha_i) = \alpha'_i$. Indeed, $\sigma$ sends a root of $f$ into a root
of $f$, so that $\sigma(\alpha_{i+1}) - \sigma(\alpha_i) = \sigma(t_i^2) > 0$. On $K(\alpha_1, \dots, \alpha_n)$, the map $\sigma$ is
uniquely defined. In particular, the image of $\alpha$ is uniquely defined. Now, with
the help of Zorn’s lemma, we may construct a uniquely defined isomorphism
between $R$ and $R'$ over $K$.


Let us prove now that every ordered field has a real closure. 


Theorem 7.2.8. Let $K$ be an ordered field, and $K'$ its extension in which
there are no relations of the form $-1 = \sum \lambda_i a_i^2$, where $\lambda_i$ are positive
elements of $K$ and $a_i \in K'$. Then the field $L$ obtained from $K'$ by adjoining the
square roots of all the positive elements of $K$ is a real field.


Proof. Suppose on the contrary that the field $L$ is not real. Then
$-1 = \sum b_i^2$, where $b_i \in L$. Therefore, in $L$, we have a relation of the form
$-1 = \sum \lambda_i b_i^2$, where the $\lambda_i$ are positive elements of $K$ and $b_i \in L$. By the hypothesis
not all the $b_i$ can simultaneously lie in $K'$. Therefore there exists a least
positive integer $r$ for which a relation of the form indicated holds with
$b_i \in K'(\sqrt{\mu_1}, \dots, \sqrt{\mu_r})$, where $\mu_1, \dots, \mu_r$ are positive elements of $K$.

Let us express $b_i$ as $b_i = x_i + y_i \sqrt{\mu_r}$, where $x_i, y_i \in K'(\sqrt{\mu_1}, \dots, \sqrt{\mu_{r-1}})$.
Then

$$-1 = \sum \lambda_i (x_i + y_i \sqrt{\mu_r})^2 = \sum \lambda_i (x_i^2 + \mu_r y_i^2) + 2\sqrt{\mu_r} \sum \lambda_i x_i y_i.$$

If $\sum \lambda_i x_i y_i \ne 0$, then $\sqrt{\mu_r} \in K'(\sqrt{\mu_1}, \dots, \sqrt{\mu_{r-1}})$, which contradicts the
assumption on the minimality of $r$. Therefore

$$-1 = \sum \lambda_i x_i^2 + \sum \lambda_i \mu_r y_i^2,$$

where $\lambda_i$ and $\lambda_i \mu_r$ are positive elements of $K$ and $x_i, y_i \in K'(\sqrt{\mu_1}, \dots, \sqrt{\mu_{r-1}})$.
This also contradicts the assumption on the minimality of $r$.


Corollary 1. Any ordered field $K$ has a real closure.


Proof. Set $K' = K$. In $K$, there are no relations of the form $-1 = \sum \lambda_i a_i^2$
with $\lambda_i$ positive. Therefore the field $L$ obtained from $K$ by adjoining square
roots of all the positive elements of $K$ is real. The real closure of $L$ is the
required real closure of $K$.

Corollary 2. Let $K$ be an ordered field and let $K'$ be an extension of $K$.
An ordering of $K$ can be extended to $K'$ if and only if there are no relations
of the form $-1 = \sum \lambda_i a_i^2$ in $K'$, where the $\lambda_i$ are positive elements of $K$ and
$a_i \in K'$.


Proof. Clearly, if a relation of the form indicated exists, then it is
impossible to extend the ordering from $K$ to $K'$. Suppose that there are no such
relations. Then one can construct a real field $L \supset K'$. Consider its real closure
$R$. The ordering of $R$ induces the required ordering of $K'$.


7.2.3 Hilbert’s seventeenth problem 


In this section we will finally prove that, if a real rational function r(x1,...,@n) 
is non-negative for all real values of 71,...,2%n, then it can be represented as 
the sum of squares of real rational functions. This proof holds not only for R 
but for an arbitrary real closed field R. The proof is easy to deduce from the 
following rather difficult theorem. 


Theorem 7.2.9 (Artin-Lang). Let $R$ be a real closed field and let $K = R(x_1,\dots,x_n)$
be an ordered finitely generated extension of $R$ such that the
ordering of $K$ is compatible with the ordering of $R$. Then there exists an
$R$-algebra homomorphism

$$\varphi: R[x_1,\dots,x_n] \to R$$

identical on $R$.


Proof. First, consider the case when the transcendence degree of $K$ over
$R$ is equal to 1. We may assume that the element $x_1 = x$ is transcendental
over $R$. Then $K$ is a finite algebraic extension of $R(x)$, and so by Theorem 21.5
(on a primitive element) $K = R(x)[y]$. Precisely the same arguments as in the
proof of Noether's lemma on normalization (see Lemma on page 222) show
that one can select $y$ which satisfies the equation

$$y^l + c_1(x)y^{l-1} + \dots + c_l(x) = 0, \quad\text{where } c_1(x),\dots,c_l(x) \in R[x].$$

We assume here that $l$ is the least possible.
Consider a polynomial

$$f(X, Y) = Y^l + c_1(X)Y^{l-1} + \dots + c_l(X)$$

in indeterminates $X$ and $Y$. To any pair of elements $a, b \in R$ such that
$f(a, b) = 0$, there corresponds an $R$-algebra homomorphism $\sigma: R[x, y] \to R$
given by the formulas $\sigma(x) = a$ and $\sigma(y) = b$. Let us show that there are
infinitely many such pairs $a, b \in R$.

Let $R_K$ be a real closure of $K$. The polynomial $f(Y) = f(x, Y) \in R(x)[Y]$
has a root $y$ in $R_K$, and so by Theorem 7.2.6 the signature of the trace form
$\varphi$ on the space

$$R(x)[Y]/(f) \cong R(x)[y]/R(x) = K/R(x)$$

is positive. This form can be reduced to the diagonal form with the elements
$h_1(x),\dots,h_s(x) \in R[x]$ on the diagonal.

Over a real closed field $R$, any polynomial factorizes into linear and irreducible
quadratic factors. The quadratic factors are of the form $(x + \alpha)^2 + \beta^2$,
where $\alpha, \beta \in R$. They are therefore positive as elements of $R(x)$, and for all
$a \in R$ the elements $(a + \alpha)^2 + \beta^2 \in R$ are also positive.

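Let $x - \lambda_1,\dots,x - \lambda_t$ be all the distinct linear factors in the factorizations
of the polynomials $h_1(x),\dots,h_s(x)$. Let us order the elements $x, \lambda_1,\dots,\lambda_t \in R(x)$.
The following possibilities might occur:

$$\dots < \lambda_i < x < \lambda_j < \dots, \quad\text{or}\quad \dots < \lambda_i < x, \quad\text{or}\quad x < \lambda_j < \dots$$

Here $\lambda_i$ and $\lambda_j$ denote the neighbours of $x$ in this ordering. Let $a$ be an arbitrary
element of $R$ satisfying, respectively, the inequalities $\lambda_i < a < \lambda_j$, or $\lambda_i < a$, or
$a < \lambda_j$ (there are infinitely many such elements $a$). Then
the signs of $h_k(a)$ and $h_k(x)$ coincide for all $k = 1,\dots,s$. The signature of
the form $\varphi$ is therefore equal to the signature of the form with the diagonal
elements $h_1(a),\dots,h_s(a)$.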

Let us show that, for almost all $a$, the trace form $\varphi_a$ on the space
$R[Y]/(f(a,Y))$ can be reduced to the diagonal form with the elements
$h_1(a),\dots,h_s(a)$ on the main diagonal. Indeed, let $A(x) = (a_{ij}(x))$ be the
matrix of the form $\varphi$ in the basis $1, Y,\dots,Y^{l-1}$ and $B(x) = (b_{ij}(x))$ the
matrix such that

$$(B(x))^T A(x) B(x) = \operatorname{diag}(h_1(x),\dots,h_s(x)).$$

Then, if $\det B(a) \ne 0$ and none of the denominators of the rational functions
$b_{ij}(x)$ vanishes at $x = a$, we have

$$(B(a))^T A(a) B(a) = \operatorname{diag}(h_1(a),\dots,h_s(a)).$$

It remains to observe that $A(a)$ is the matrix of the form $\varphi_a$ in the basis
$1, Y,\dots,Y^{l-1}$.

Thus, there exist infinitely many elements $a \in R$ for which the signature
of the form $\varphi_a$ is positive. For all such $a$, the polynomial $f(a, Y)$ has a root
$b \in R$, i.e., $f(a,b) = 0$. As we have already observed, to any such pair there
corresponds an $R$-algebra homomorphism $\sigma: R[x, y] \to R$. Let us show that
almost all such homomorphisms can be extended to

$$R[x, y, x_2,\dots,x_n] \supset R[x, y].$$

Recall that $x_2,\dots,x_n \in R(x_1,\dots,x_n) = K = R(x)[y]$, and therefore
$x_i = \dfrac{p_i(x,y)}{q_i(x)}$, where $p_i$ and $q_i$ are polynomials. Let $q = q_2 \cdots q_n$. By construction
$\sigma(q(x)) = q(a) \ne 0$ for almost all $a$. In these cases the homomorphism $\sigma$ can
be extended to

$$R[x, y]\left[\frac{1}{q(x)}\right] \supset R[x, y, x_2,\dots,x_n] \supset R[x_1,\dots,x_n].$$


The passage from the case when the transcendence degree of $K$ over $R$ is
equal to 1 to the case of transcendence degree $m > 1$ is performed by a simple
induction on $m$. Suppose that the existence of the homomorphism required
is proved for all fields $K$ whose transcendence degree over $R$ is strictly less
than $m$. Consider a field $K = R(x_1,\dots,x_n)$ whose transcendence degree over
$R$ is equal to $m$. Select an intermediate field $F$, $R \subset F \subset K$, for which the
transcendence degree of $K$ over $F$ is equal to 1. Let $R_F \subset R_K$ be real closures
of $F$ and $K$. The transcendence degree of $R_K$ over $R_F$ is equal to 1, and so
there exists an $R_F$-algebra homomorphism $\psi: R_F[x_1,\dots,x_n] \to R_F$.

The transcendence degree of $R_F$ over $R$ is equal to $m - 1$, and so the
transcendence degree of the field $R(\psi(x_1),\dots,\psi(x_n)) \subset R_F$ over $R$ does not
exceed $m - 1$. It is also clear that an ordering of $R_F$ induces an ordering of the
field $R(\psi(x_1),\dots,\psi(x_n))$. Therefore there exists an $R$-algebra homomorphism
$\sigma: R[\psi(x_1),\dots,\psi(x_n)] \to R$. The restriction of the composition of the maps
$\psi$ and $\sigma$ onto the subalgebra

$$R[x_1,\dots,x_n] \subset R_F[x_1,\dots,x_n]$$

is the homomorphism required.


Now it is easy to prove Artin's theorem on non-negative rational functions.
Let $k$ be an ordered field. The rational function

$$r(x_1,\dots,x_n) = \frac{p(x_1,\dots,x_n)}{q(x_1,\dots,x_n)}, \quad\text{where } p, q \in k[x_1,\dots,x_n],$$

is called non-negative if $r(a_1,\dots,a_n) \ge 0$ for all $a_1,\dots,a_n \in k$ such that
$q(a_1,\dots,a_n) \ne 0$.


Theorem 7.2.10 (Artin). Let $R$ be a real closed field, and let $r \in R(x_1,\dots,x_n)$
be a non-negative rational function. Then $r$ can be represented as the sum of
squares of elements of $R(x_1,\dots,x_n)$.


Proof. Suppose on the contrary that $r$ is not the sum of squares of elements
of the field $R(x_1,\dots,x_n)$. Then by Theorem 7.2.4 there exists an
ordering of the field $R(x_1,\dots,x_n)$ for which $r < 0$. Let us express $r$ as an
irreducible fraction $p/q$, where $p, q \in R[x_1,\dots,x_n]$. Consider an $R$-algebra

$$R\left[x_1,\dots,x_n, \frac{1}{q(x_1,\dots,x_n)}\right]$$

that contains $r$.

In $R_1$, the real closure of the ordered field $R(x_1,\dots,x_n)$, there exists an
element $\gamma$ such that $\gamma^2 = -r > 0$. The field $R(x_1,\dots,x_n,\gamma)$ is contained in
$R_1$, and therefore the ordering of $R_1$ induces an ordering of $R(x_1,\dots,x_n,\gamma)$.
By the Artin-Lang theorem there exists a homomorphism

$$\varphi: R\left[x_1,\dots,x_n, \gamma, \frac{1}{q(x_1,\dots,x_n)}, \frac{1}{\gamma}\right] \to R,$$

identical on $R$.
Clearly, $\varphi(\gamma)\varphi\left(\frac{1}{\gamma}\right) = 1$ and $\varphi(q)\varphi\left(\frac{1}{q}\right) = 1$. Hence $\varphi(\gamma) \ne 0$ and $\varphi(q) \ne 0$.
Hence also $\varphi(r) = -\varphi(\gamma^2) = -(\varphi(\gamma))^2 < 0$. But

$$\varphi(r) = \frac{\varphi(p)}{\varphi(q)} = \frac{p(a_1,\dots,a_n)}{q(a_1,\dots,a_n)} = r(a_1,\dots,a_n), \quad\text{where } a_i = \varphi(x_i).$$

Here $q(a_1,\dots,a_n) = \varphi(q) \ne 0$. The inequality $r(a_1,\dots,a_n) < 0$ contradicts
the hypotheses of the theorem.


For an arbitrary (i.e., not necessarily real closed) ordered field, Artin's theorem
is false (the first example of such a field is given in [Du1]; a simpler proof
can be found on page 86 of the book [Pf2]). However, with a slight sharpening
of the proof of Theorem 7.2.10, we can prove the following statement for an
arbitrary ordered field $k$.


Theorem 7.2.11. Let $k$ be an ordered field and $R$ its real closure. If a rational
function $r \in k(x_1,\dots,x_n)$ is such that $r(a_1,\dots,a_n) \ge 0$ for all $a_1,\dots,a_n \in R$
(provided, of course, that the value $r(a_1,\dots,a_n)$ is defined), then $r$ can be represented as
the sum of squares of elements of $k(x_1,\dots,x_n)$.


Proof. If $r$ is not the sum of squares of elements of $k(x_1,\dots,x_n)$, then
there exists an ordering of the field $k(x_1,\dots,x_n)$ for which $r < 0$. Let $R'$ be a
real closure of the field $k(x_1,\dots,x_n)$ with such an ordering. We may assume
that $R \subset R'$ (the real closure of the field $k(x_1,\dots,x_n)$ contains a real closure
$R_1$ of $k$, and $R_1$ and $R$ are isomorphic as real closures of the same field). In
$R'$, there exists an element $\gamma$ such that $\gamma^2 = -r > 0$. In the $R$-algebra

$$R\left[x_1,\dots,x_n, \gamma, \frac{1}{q(x_1,\dots,x_n)}, \frac{1}{\gamma}\right] \subset R'$$

we introduce an ordering induced by the ordering in $R'$. The remaining arguments
are precisely the same as in the proof of Theorem 7.2.10.


Corollary. Let the ordering of the field $k$ be such that the condition
"$r(a_1,\dots,a_n) \ge 0$ for all $a_1,\dots,a_n \in k$" implies that $r(a_1,\dots,a_n) \ge 0$
for all $a_1,\dots,a_n \in R$, where $R$ is a real closure of $k$. Then Artin's theorem
holds for $k$.


In particular, Artin's theorem holds for the field of rationals $\mathbb{Q}$, i.e., any
non-negative rational function with rational coefficients is the sum of squares
of rational functions with rational coefficients.

In conclusion we give formulations of two interesting theorems whose
proofs are based on Artin's theorem. More precisely, the first of these theorems
is deduced from Artin's theorem and the second follows from the first.


Theorem 7.2.12 ([Ri]). Let $I$ be an ideal in the ring $\mathbb{R}[x_1,\dots,x_n]$. Then
the following conditions are equivalent:

(1) Any polynomial that vanishes at all the common real zeros of the polynomials
from $I$ belongs to $I$ itself;

(2) If a sum of squares of polynomials from the ring $\mathbb{R}[x_1,\dots,x_n]$
belongs to $I$, then these polynomials themselves belong to $I$.


Theorem 7.2.13 ([St1]). The homogeneous form $F$ is non-negative if and
only if there exists a homogeneous polynomial relation of the form $\varphi(-F) = 0$,
where

$$\varphi(u) = u^{2n} + a_1u^{2n-1} + \dots + a_{2n}$$

and the coefficients $a_1,\dots,a_{2n}$ are sums of squares of homogeneous forms.


7.3 Pfister’s theory 


7.3.1 The multiplicative quadratic forms 


In this section we consider quadratic forms over an arbitrary field $k$ whose
characteristic is distinct from 2. Any quadratic form $\varphi$ on the space $k^n$ is
given by a symmetric $n \times n$ matrix $A$. Namely, if $x = (x_1,\dots,x_n) \in k^n$, then

$$\varphi(x) = xAx^T = \sum_{i,j} a_{ij}x_ix_j.$$


The quadratic forms $\varphi$ and $\psi$ are called equivalent if there exists a non-singular
$n \times n$ matrix $P$ such that $\psi(x) = \varphi(Px)$. Then $\psi$ is given by the
matrix $PAP^T$. For equivalent forms we write $\varphi \cong \psi$.

We say that the form $\varphi$ represents an element $a \in k$ if $a = \varphi(x)$ for some
$x \in k^n$. For example, the form $\varphi(x) = x_1^2 + \dots + x_n^2$ represents $a$ if and only
if $a$ can be represented as the sum of $n$ squares.

The main tools of Pfister's theory are multiplicative forms of a particular
type he introduced. Recall that the quadratic form $\varphi$ is multiplicative if the
forms $\varphi$ and $a\varphi$ are equivalent for any nonzero element $a$ representable by $\varphi$.

Any multiplicative form $\varphi$ possesses the following remarkable property: if
$a$ and $b$ are representable by $\varphi$, then $ab$ is also representable by $\varphi$. Indeed,
equivalent forms represent the same set of elements of the field $k$. Now if a
multiplicative form $\varphi$ represents $a$ and $b$, then the form $a\varphi$ equivalent to $\varphi$
represents $ab$.

We introduce the following notation. Denote

$$\langle a_1,\dots,a_n\rangle = a_1x_1^2 + \dots + a_nx_n^2.$$

For the forms $\varphi = \langle a_1,\dots,a_m\rangle$ and $\psi = \langle b_1,\dots,b_n\rangle$, we define

$$\varphi \oplus \psi = \langle a_1,\dots,a_m, b_1,\dots,b_n\rangle,$$
$$\varphi \otimes \psi = \langle a_1b_1,\dots,a_mb_1, a_1b_2,\dots,a_mb_2,\dots,a_1b_n,\dots,a_mb_n\rangle.$$


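Pfister introduced the forms

$$\langle\langle a_1,\dots,a_n\rangle\rangle = \langle 1, a_1\rangle \otimes \langle 1, a_2\rangle \otimes \dots \otimes \langle 1, a_n\rangle, \quad\text{where } a_1 \cdots a_n \ne 0,$$

which play the main role in his theory. The most interesting for us is the form
$\langle\langle \underbrace{1,\dots,1}_{n} \rangle\rangle$, i.e., the sum of $2^n$ squares. However, in the proof by induction on
$n$ we cannot avoid considering the general case $\langle\langle a_1,\dots,a_n\rangle\rangle$.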


Theorem 7.3.1 (Pfister). The form $\langle\langle a_1,\dots,a_n\rangle\rangle$ is multiplicative.


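Proof. We consider first the case $n = 1$. Let the form $\langle\langle a_1\rangle\rangle = x_1^2 + a_1x_2^2$
represent $b \ne 0$, i.e., $b = c_1^2 + a_1c_2^2$. Set

$$A = \begin{pmatrix} 1 & 0 \\ 0 & a_1 \end{pmatrix} \quad\text{and}\quad P = \begin{pmatrix} c_1 & c_2 \\ -a_1c_2 & c_1 \end{pmatrix}.$$

It is easy to verify that $PAP^T = bA$ and $\det P = b \ne 0$. This means that
$\langle\langle a_1\rangle\rangle \cong b\langle\langle a_1\rangle\rangle$.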
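The case $n = 1$ can be checked mechanically. The sketch below is only an illustration in exact rational arithmetic: the matrix `P` used here, with rows $(c_1, c_2)$ and $(-a_1c_2, c_1)$, is one choice that satisfies $PAP^T = bA$ and is an assumption of this sketch, as are the helper names `mat_mul`, `transpose` and `check_case_n1`.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def check_case_n1(a1, c1, c2):
    """Check P A P^T = b A and det P = b for b = c1^2 + a1*c2^2."""
    b = c1 * c1 + a1 * c2 * c2
    A = [[1, 0], [0, a1]]
    # Assumed choice of P; any P with P A P^T = b A would do.
    P = [[c1, c2], [-a1 * c2, c1]]
    lhs = mat_mul(mat_mul(P, A), transpose(P))
    rhs = [[b, 0], [0, b * a1]]
    det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return lhs == rhs and det_P == b
```

For instance, `check_case_n1(Fraction(3), Fraction(1, 2), Fraction(5, 7))` evaluates the identity for $a_1 = 3$, $c_1 = 1/2$, $c_2 = 5/7$.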
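The form $\langle\langle a_1,\dots,a_n\rangle\rangle$ is of the type $\varphi \oplus a_n\varphi$, where $\varphi = \langle\langle a_1,\dots,a_{n-1}\rangle\rangle$.
It therefore suffices to prove that if $\varphi$ is multiplicative and $a \ne 0$, then the
form $\varphi \oplus a\varphi$ is also multiplicative. Let $b = \varphi(x) + a\varphi(y) = \xi + a\eta \ne 0$, where $\xi$
and $\eta$ are representable by $\varphi$. If $\xi = 0$, then $\eta \ne 0$, and hence $\eta\varphi \cong \varphi$. Hence

$$b(\varphi \oplus a\varphi) = a\eta(\varphi \oplus a\varphi) \cong a\varphi \oplus a^2\varphi \cong \varphi \oplus a\varphi,$$

since $a^2\varphi \cong \varphi$. The case $\eta = 0$ is similarly considered. It remains to consider
the case $\xi\eta \ne 0$. In this case $\eta\varphi \cong \xi\varphi$ and $\varphi \cong \xi^{-1}\varphi \cong (\eta\xi^{-1})\varphi$. Hence

$$b(\varphi \oplus a\varphi) = (\xi + a\eta)(\varphi \oplus a\varphi) \cong (1 + a\eta\xi^{-1})(\varphi \oplus (a\eta\xi^{-1})\varphi) = (1 + a\eta\xi^{-1})\langle\langle a\eta\xi^{-1}\rangle\rangle \otimes \varphi.$$

When we considered the case $n = 1$ we showed that the form $\langle\langle a\eta\xi^{-1}\rangle\rangle$ is
multiplicative. This form represents an element $1 + a\eta\xi^{-1} = b\xi^{-1} \ne 0$. Hence
$(1 + a\eta\xi^{-1})\langle\langle a\eta\xi^{-1}\rangle\rangle \cong \langle\langle a\eta\xi^{-1}\rangle\rangle$, and therefore

$$b(\varphi \oplus a\varphi) \cong \langle\langle a\eta\xi^{-1}\rangle\rangle \otimes \varphi \cong \varphi \oplus (a\eta\xi^{-1})\varphi \cong \varphi \oplus a\varphi,$$

as was required.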


In the proof of Pfister's theorem we did not use the fact that $\operatorname{char} k \ne 2$,
and therefore it is true for any field. The case where $a_1 = \dots = a_n = 1$ is
particularly interesting. In this case Pfister's theorem implies that

in any field the product of elements each representable as the sum of $2^n$
squares is also representable as the sum of $2^n$ squares.

For $n = 1, 2, 3$, there are explicit general formulas for the representation.
For example, when $n = 1$ and we have the sum of two squares, the corresponding
identity is

$$(x_1^2 + x_2^2)(y_1^2 + y_2^2) = (x_1y_1 + x_2y_2)^2 + (x_1y_2 - x_2y_1)^2.$$

But for $n > 3$ there are no identities of similar form.
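The identities for $n = 1$ and $n = 2$ are easy to test numerically. The following sketch checks the two-square identity quoted above and, as an additional illustration not spelled out in the text, Euler's classical four-square identity (the norm of a product of quaternions). The function names are hypothetical.

```python
import random


def two_squares(x1, x2, y1, y2):
    """Components whose squares sum to (x1^2 + x2^2)(y1^2 + y2^2)."""
    return (x1 * y1 + x2 * y2, x1 * y2 - x2 * y1)


def four_squares(a, b):
    """Euler's four-square identity: components of the quaternion product a*b."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
            a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
            a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
            a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1)


def norm(v):
    """Sum of squares of the components."""
    return sum(t * t for t in v)


def random_check(trials=100, bound=50):
    """Test both identities on random integer inputs."""
    for _ in range(trials):
        x = [random.randint(-bound, bound) for _ in range(2)]
        y = [random.randint(-bound, bound) for _ in range(2)]
        if norm(two_squares(x[0], x[1], y[0], y[1])) != norm(x) * norm(y):
            return False
        a = [random.randint(-bound, bound) for _ in range(4)]
        b = [random.randint(-bound, bound) for _ in range(4)]
        if norm(four_squares(a, b)) != norm(a) * norm(b):
            return False
    return True
```

A random test of this kind does not prove the identities, but it catches sign errors in the formulas quickly.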
We will need the following auxiliary statement. 


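Lemma 7.3.2. Let us represent the form

$$\varphi = \langle\langle a_1,\dots,a_n\rangle\rangle, \quad\text{where } a_1 \cdots a_n \ne 0,$$

as $\varphi = \langle 1\rangle \oplus \varphi'$. Let $\varphi'$ represent an element $b_1 \ne 0$. Then $\varphi \cong \langle\langle b_1,\dots,b_n\rangle\rangle$
for some nonzero $b_2,\dots,b_n$.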


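Proof. We use induction on $n$. For $n = 1$, the form $\varphi'$ is of the type $a_1x_1^2$.
If $b_1 = a_1c^2$, then $b_1x_1^2 = a_1(cx_1)^2 \cong a_1x_1^2$. Thus $\langle a_1\rangle \cong \langle b_1\rangle$, and therefore
$\langle\langle a_1\rangle\rangle \cong \langle\langle b_1\rangle\rangle$.

Now suppose that the statement required is proved for all the forms of the
type $\langle\langle c_1,\dots,c_{n-1}\rangle\rangle$, where $c_1 \cdots c_{n-1} \ne 0$. To prove the statement required
for the form $\varphi = \langle\langle a_1,\dots,a_n\rangle\rangle$, consider the form

$$\psi = \langle\langle a_1,\dots,a_{n-1}\rangle\rangle = \langle 1\rangle \oplus \psi'.$$

Clearly, $\varphi = \psi \oplus a_n\psi$ and $\varphi' = \psi' \oplus a_n\psi$. Therefore the element $b_1$ representable
by the form $\varphi'$ can be expressed as $b_1 = b_1' + a_nb$, where the elements $b_1'$ and
$b$ are represented by the forms $\psi'$ and $\psi$, respectively.

First, let $b = 0$. In this case, $b_1' = b_1$. By the induction hypothesis $\psi \cong \langle\langle b_1,\dots,b_{n-1}\rangle\rangle$,
and therefore $\varphi \cong \langle\langle b_1,\dots,b_{n-1},a_n\rangle\rangle$.

Now consider the case $b \ne 0$. In this case, the form $\psi$ represents a nonzero
element $b$. By Pfister's theorem, the form $\psi$ is multiplicative, and so $\psi \cong b\psi$.
Hence

$$\varphi' \cong \psi' \oplus a_n(b\psi) = \psi' \oplus c_n\psi,$$

where $c_n = a_nb$. Here $b_1 = b_1' + c_n$. Let $b_1' = 0$. Then $b_1 = c_n$ and

$$\varphi \cong \langle\langle c_n\rangle\rangle \otimes \psi = \langle\langle c_n, a_1,\dots,a_{n-1}\rangle\rangle = \langle\langle b_1, a_1,\dots,a_{n-1}\rangle\rangle.$$

It remains to consider the case when $b_1 = b_1' + c_n$ and $b_1'c_n \ne 0$. The form
$\psi'$ represents $b_1'$, and so by the induction hypothesis $\psi \cong \langle\langle b_1', b_2,\dots,b_{n-1}\rangle\rangle$.
Hence,

$$\varphi \cong \langle\langle b_1', b_2,\dots,b_{n-1}, c_n\rangle\rangle = \langle\langle b_1', c_n, b_2,\dots,b_{n-1}\rangle\rangle = \langle\langle b_1', c_n\rangle\rangle \otimes \langle\langle b_2,\dots,b_{n-1}\rangle\rangle.$$

Let $\lambda = b_1'$ and $\mu = c_n$. Then $\lambda + \mu = b_1 \ne 0$. The identity

$$\begin{pmatrix} 1 & 1 \\ \mu & -\lambda \end{pmatrix}\begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}\begin{pmatrix} 1 & \mu \\ 1 & -\lambda \end{pmatrix} = \begin{pmatrix} \lambda + \mu & 0 \\ 0 & (\lambda + \mu)\lambda\mu \end{pmatrix}$$

shows that $\langle b_1', c_n\rangle \cong \langle b_1, b_1b_1'c_n\rangle$. Hence

$$\langle\langle b_1', c_n\rangle\rangle = \langle 1, b_1', c_n, b_1'c_n\rangle \cong \langle 1, b_1, b_1b_1'c_n, b_1'c_n\rangle = \langle\langle b_1, b_1'c_n\rangle\rangle.$$

Set $b_n = b_1'c_n$. Then

$$\varphi \cong \langle\langle b_1, b_n\rangle\rangle \otimes \langle\langle b_2,\dots,b_{n-1}\rangle\rangle \cong \langle\langle b_1,\dots,b_n\rangle\rangle.$$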


7.3.2 $C_i$-fields


In the preceding section we met one of the tools of Pfister's theory, namely
multiplicative forms. The other tool of this theory, $C_i$-fields, was developed in
1933-1936 by the Chinese mathematician Tsen Chiungze and rediscovered in
1952 by Lang.

The field $K$ is called a $C_i$-field if any system of homogeneous polynomials

$$f_1,\dots,f_r \in K[x_1,\dots,x_n]$$

such that $d_1^i + \dots + d_r^i < n$, where $d_s = \deg f_s$, has a common non-trivial zero.


Theorem 7.3.3 (Tsen-Lang). If the field $L$ is algebraically closed, then the
field $K = L(t_1,\dots,t_i)$, i.e., the field of rational functions in $i$ indeterminates
over $L$, is a $C_i$-field.


Proof. We use induction on $i$. For $i = 0$, the field $K$ coincides with $L$,
i.e., $K$ is algebraically closed and the condition $d_1^i + \dots + d_r^i < n$ means that
$r < n$. In this case the statement required coincides with the homogeneous
Hilbert's Nullstellensatz (Theorem 6.1.10 on page 231).

The inductive step consists in the proof of the fact that, if $L$ is a $C_i$-field,
then $K = L(t)$ is a $C_{i+1}$-field. Let $f_1,\dots,f_r \in K[x_1,\dots,x_n]$ be homogeneous
polynomials such that $d_1^{i+1} + \dots + d_r^{i+1} < n$, where $d_s = \deg f_s$. We have to
prove that $f_1,\dots,f_r$ have a common non-trivial zero in $K$. The coefficients
of the polynomials $f_1,\dots,f_r$ belong to $K = L(t)$, i.e., they are rational functions
in $t$ over $L$. Having multiplied all these coefficients by an appropriate
polynomial from $L[t]$ we may assume that the coefficients of the polynomials
$f_1,\dots,f_r$ belong to $L[t]$, i.e., these coefficients are polynomials in $t$ over $L$.
The degrees of these polynomials are bounded by a number $m$, and so any
coefficient $a$ can be expressed as

$$a = a_0 + a_1t + \dots + a_mt^m, \quad\text{where } a_0,\dots,a_m \in L. \qquad (1)$$

For $p = 1,\dots,n$, define

$$x_p = x_{p0} + x_{p1}t + \dots + x_{ps}t^s, \qquad (2)$$

where $x_{p0},\dots,x_{ps}$ are independent variables over $L$ and $s$ is sufficiently
large (we will determine this number shortly). In every homogeneous form
$f_1,\dots,f_r$, replace the coefficients and indeterminates by expressions (1) and
(2), respectively. As a result, the form $f_j$ takes the form

$$g_0 + g_1t + \dots + g_Nt^N,$$

where $N = sd_j + m$ and $g_0,\dots,g_N$ are forms of degree $d_j$ in $(s + 1)n$
variables $x_{pq}$ over $L$. By the induction hypothesis all the forms $g$ for the
polynomials $f_1,\dots,f_r$ have a common non-trivial zero if

$$\sum_{j=1}^{r} (sd_j + m + 1)d_j^i < (s + 1)n,$$

i.e.,

$$(m + 1)\sum d_j^i - n < s\left(n - \sum d_j^{i+1}\right).$$

By the hypothesis we have $n - \sum d_j^{i+1} > 0$, and so the inequality required
holds for sufficiently large values of $s$, such as $s > (m + 1)\sum d_j^i$.
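The counting argument at the end of the proof can be illustrated with a few lines of arithmetic. In the sketch below (the names `sufficient_s` and `counting_inequality_holds` are hypothetical), `degrees` is the list $d_1,\dots,d_r$, and `m` bounds the degrees in $t$ of the coefficients; the value returned by `sufficient_s` is just the bound $s > (m+1)\sum d_j^i$ from the text, not the smallest admissible $s$.

```python
def sufficient_s(degrees, n, i, m):
    """Return an s with s > (m + 1) * sum(d^i), assuming sum(d^(i+1)) < n."""
    if sum(d ** (i + 1) for d in degrees) >= n:
        raise ValueError("hypothesis sum d_j^(i+1) < n is violated")
    return (m + 1) * sum(d ** i for d in degrees) + 1


def counting_inequality_holds(degrees, n, i, m, s):
    """Check sum_j (s*d_j + m + 1) * d_j^i < (s + 1) * n."""
    total = sum((s * d + m + 1) * d ** i for d in degrees)
    return total < (s + 1) * n
```

For one quadratic form in five variables with $i = 1$ and $m = 3$, `sufficient_s([2], 5, 1, 3)` gives 9, and the inequality reads $44 < 50$; with $s = 1$ it fails ($12 < 10$ is false), which shows why $s$ has to be taken large.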


7.3.3 Pfister’s theorem on the sums of squares of rational functions 


In the two preceding sections we have prepared the main tools for the proof of
Pfister's theorem. We will also need the following property of quadratic forms:
if the non-degenerate quadratic form $\varphi$ over $K$ is isotropic (i.e., $\varphi(u) = 0$ for
some $u \ne 0$), then it is universal, i.e., it represents all the elements of the
field $K$. Indeed, to any non-degenerate quadratic form $\varphi$ there corresponds a
non-degenerate bilinear symmetric form

$$f(x,y) = \frac{1}{2}\left(\varphi(x + y) - \varphi(x) - \varphi(y)\right).$$

Therefore there exists a vector $v$ such that $f(u,v) = 1$. In this case

$$\varphi(v + \lambda u) = \varphi(v) + \varphi(\lambda u) + 2f(v, \lambda u) = \varphi(v) + 2\lambda.$$

For any $b \in K$, we can select a $\lambda$ such that $\varphi(v) + 2\lambda = b$.
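The argument is constructive, and a small sketch over $\mathbb{Q}$ makes it concrete. The code below assumes the form is given by a symmetric matrix `A` (so $\varphi(x) = xAx^T$ and $f(x,y) = xAy^T$) and that an isotropic vector `u` is already known; the function names are hypothetical, and the vector $v$ is found by scaling a standard basis vector, which is only one of many possible choices.

```python
from fractions import Fraction


def bilinear(A, x, y):
    """f(x, y) = x A y^T for a symmetric matrix A."""
    n = len(x)
    return sum(A[i][j] * x[i] * y[j] for i in range(n) for j in range(n))


def phi(A, x):
    """The quadratic form phi(x) = x A x^T."""
    return bilinear(A, x, x)


def represent(A, u, b):
    """Given u != 0 with phi(u) = 0, return x with phi(x) = b."""
    n = len(u)
    if all(t == 0 for t in u) or phi(A, u) != 0:
        raise ValueError("u must be a nonzero isotropic vector")
    for k in range(n):
        e = [Fraction(int(i == k)) for i in range(n)]
        c = bilinear(A, u, e)
        if c != 0:
            v = [t / c for t in e]  # now f(u, v) = 1
            break
    else:
        raise ValueError("f(u, .) vanishes identically: form is degenerate")
    lam = (Fraction(b) - phi(A, v)) / 2  # solves phi(v) + 2*lam = b
    return [v[i] + lam * u[i] for i in range(n)]
```

For the hyperbolic form $x_1^2 - x_2^2$ with $u = (1, 1)$ and $b = 7/3$, the function returns $(5/3, 2/3)$, and indeed $25/9 - 4/9 = 7/3$.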

Theorem 7.3.3 implies that, if $L$ is algebraically closed, then any non-degenerate
quadratic form $\varphi$ in $2^n$ variables over $K = L(t_1,\dots,t_n)$ is universal.
Indeed, let $r \in K$. Consider an auxiliary quadratic form

$$\Phi(u, t) = \varphi(u) - rt^2$$

in $2^n + 1$ variables. By Theorem 7.3.3 the form $\Phi$ has a non-trivial zero $(u_0, t_0)$.
If $t_0 = 0$, then $\varphi(u_0) = 0$ for some $u_0 \ne 0$. This means that $\varphi$ is isotropic, and
therefore universal. If $t_0 \ne 0$, then $\varphi\left(\dfrac{u_0}{t_0}\right) = r$, i.e., $\varphi$ represents $r$.


Theorem 7.3.4 (Pfister). Let $R$ be a real closed field; let $\varphi = \langle\langle a_1,\dots,a_n\rangle\rangle$
be a non-degenerate quadratic form in $2^n$ variables over $R(x_1,\dots,x_n)$. If $b$ is
the sum of squares of elements of $R(x_1,\dots,x_n)$, then $\varphi$ represents $b$.


Proof. If the form $\varphi$ is isotropic, then it is universal. We may therefore
assume that $\varphi$ is anisotropic, i.e., $\varphi(u) \ne 0$ for $u \ne 0$. By the hypothesis
$b = b_1^2 + \dots + b_m^2$. Let us use induction on $m$. For $m = 1$, the statement is
obvious, since any multiplicative form represents 1, and hence it represents
$b_1^2 \cdot 1$ as well.

Now let $m = 2$, i.e., $b = b_1^2 + b_2^2$, where $b_1b_2 \ne 0$. Let $L = R(i)$ be the
algebraic closure of $R$. The element $\beta = b_1 + ib_2$ generates the field $L(x_1,\dots,x_n)$ over
the field $R(x_1,\dots,x_n)$, i.e.,

$$L(x_1,\dots,x_n) = R(x_1,\dots,x_n)(\beta).$$

The field $R$ is real, and so $i \notin R(x_1,\dots,x_n)$ and $\beta \notin R(x_1,\dots,x_n)$.

The form $\varphi$ can also be considered as a form over the field $L(x_1,\dots,x_n) \supset R(x_1,\dots,x_n)$.
Over this field $\varphi$ is universal since $L$ is algebraically closed.
Therefore there exist $2^n$-dimensional vectors $u, v$ over $R(x_1,\dots,x_n)$ for which
$\varphi(u + \beta v) = \beta$, i.e.,

$$\varphi(u) + 2\beta f(u,v) + \beta^2\varphi(v) = \beta, \qquad (1)$$

where $f$ is the bilinear symmetric form corresponding to $\varphi$. Here $v \ne 0$ since
otherwise $\beta = \varphi(u) \in R(x_1,\dots,x_n)$. Since $\varphi$ is anisotropic, it follows that
$\varphi(v) \ne 0$.

The irreducible equation for $\beta$ over $R(x_1,\dots,x_n)$ is of the form $(\beta - b_1)^2 + b_2^2 = 0$, i.e.,

$$\beta^2 - 2b_1\beta + b = 0. \qquad (2)$$

Comparing (1) and (2) we deduce that $b = \varphi(u)/\varphi(v)$. Since $\varphi$ is multiplicative,
it follows that $\varphi$ represents both $\dfrac{1}{\varphi(v)}$ and the product of $\varphi(u)$ and $\dfrac{1}{\varphi(v)}$,
i.e., $b$.

Now suppose that the statement required is proved for some $m \ge 2$, i.e.,
any form $\varphi$ of the type $\varphi = \langle\langle a_1,\dots,a_n\rangle\rangle$ represents any element of the type
$b_1^2 + \dots + b_m^2$. We have to prove that the form $\varphi$ represents any element of the
type $b_1^2 + \dots + b_m^2 + b_{m+1}^2$, where $b_{m+1} \ne 0$. Let us express this element in the
form $b_{m+1}^2(b + 1)$, where $b$ is the sum of $m$ squares. It suffices to prove that
the form $\varphi$ represents the element $c = b + 1$. We may assume that $c \ne 0$.

To make use of Lemma 7.3.2 (see page 272), we express the form $\varphi$ as
$\varphi = \langle 1\rangle \oplus \varphi'$. By the induction hypothesis $\varphi$ represents $b$, i.e., $b = b_0^2 + b'$,
where $b'$ is represented by $\varphi'$. Consider the multiplicative form $\psi = \varphi \otimes \langle 1, -c\rangle$
in $2^{n+1}$ variables. Clearly,

$$\psi = \langle 1\rangle \oplus \varphi' \oplus (-c)\varphi = \langle 1\rangle \oplus \psi',$$

where the form $\psi' = \varphi' \oplus (-c\varphi)$ represents the element

$$b' - c = (b - b_0^2) - (1 + b) = -1 - b_0^2.$$

Here $-1 - b_0^2 \ne 0$ since $i \notin R(x_1,\dots,x_n)$. In this case we may apply Lemma
7.3.2 to the form $\psi$ and the element $-1 - b_0^2$.
As a result, we find that there exist non-zero elements $c_1,\dots,c_n \in R(x_1,\dots,x_n)$ such that

$$\psi \cong \langle\langle -1 - b_0^2, c_1,\dots,c_n\rangle\rangle,$$

i.e., $\psi \cong \langle\langle -1 - b_0^2\rangle\rangle \otimes \chi = \chi \oplus (-1 - b_0^2)\chi$, where $\chi \cong \langle\langle c_1,\dots,c_n\rangle\rangle$.

Let us now apply the induction hypothesis again, this time to the form $\chi$.
The element $1 + b_0^2$ is the sum of not more than $m$ squares, and so the form $\chi$
represents it. It then follows from the multiplicativity of $\chi$ that $\chi \cong (1 + b_0^2)\chi$,
and therefore

$$\varphi \oplus (-c\varphi) = \psi \cong \chi \oplus (-1 - b_0^2)\chi \cong (1 + b_0^2)\chi \oplus (-1 - b_0^2)\chi.$$

Let $\xi = (1 + b_0^2)\chi$. Then

$$\varphi(Px + Qy) - c\varphi(Rx + Sy) = \xi(x) - \xi(y),$$

where $\begin{pmatrix} P & Q \\ R & S \end{pmatrix} = U$ is a non-singular $2^{n+1} \times 2^{n+1}$ matrix. By setting $x = y$ we find

$$\varphi(Ax) = c\varphi(Bx), \quad\text{where } A = P + Q \text{ and } B = R + S.$$

Since $U$ is non-singular, it follows that, if $x \ne 0$, then $(Ax, Bx) \ne (0,0)$.
If at least one of the matrices $A$ and $B$ were singular, the form $\varphi$ would be
isotropic, whereas we are considering an anisotropic form. Since therefore both
the matrices are invertible, $\varphi \cong c\varphi$, and therefore the multiplicative form $\varphi$
represents the element $c$.


By Artin's theorem any non-negative element of the field $R(x_1,\dots,x_n)$,
where $R$ is a real closed field, is the sum of squares. The multiplicative form
$\langle\langle \underbrace{1,\dots,1}_{n} \rangle\rangle$ represents, therefore, any non-negative element of $R(x_1,\dots,x_n)$,
i.e., any non-negative element of this field can be represented as the sum of
$2^n$ squares.

On page 251 we gave an example of an element of $R(x_1,\dots,x_n)$ that cannot
be represented as the sum of $n$ squares. Therefore, if $N$ is the least number for
which any non-negative element of the field $R(x_1,\dots,x_n)$ can be represented as the sum of
$N$ squares, then $n + 1 \le N \le 2^n$.

In the general case no other estimates on $N$ are known. Only for $n = 2$ is it
known that $N = 4$. This statement has been proved in two distinct ways, but both
proofs are rather complicated. One of the proofs uses the theory of elliptic
curves over the field $\mathbb{C}(x)$ (see [Ca6] and [Ch2]). The other proof is based on
the Noether-Lefschetz theorem which states that

on any generic surface of degree $d \ge 4$ in three-dimensional projective
space, any curve is singled out as the intersection with a certain other surface
(see [Co1]).


In the proof of Pfister’s theorem one essentially uses the fact that R is 
real closed. For Q, Pfister’s theorem is false. This is clear already for n = 0. 
Indeed, not every positive rational number is the square of a rational number. 
But for n = 0 the estimate required is known: 


every positive rational number is the sum of squares of four rational num- 
bers. 


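Indeed, by Meyer's theorem (see [BS], [Ca7] or [Se1]) any non-degenerate
quadratic form over $\mathbb{Q}$ in $n \ge 5$ indeterminates non-trivially represents 0 if it
represents 0 over $\mathbb{R}$. In particular, if $r > 0$, the form

$$x_1^2 + x_2^2 + x_3^2 + x_4^2 - rx_5^2$$

represents 0 over $\mathbb{Q}$. In a non-trivial zero we necessarily have $x_5 \ne 0$, and
dividing by $x_5^2$ expresses $r$ as the sum of four rational squares.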

For $n = 1$, i.e., for polynomials over $\mathbb{Q}$ in one indeterminate, the first
to obtain an estimate was E. Landau in the 1906 paper [La1]. He showed that
any non-negative polynomial (in one indeterminate) with rational coefficients
can be represented as the sum of 8 squares of polynomials with rational coefficients
(a simple proof of Landau's theorem is given in Chapter 7 of the book
[Pf2]). But this estimate is not exact. The exact estimate was obtained
by Pourchet [Po3]: every non-negative polynomial over $\mathbb{Q}$ can be represented
as the sum of squares of 5 polynomials over $\mathbb{Q}$. Chapter 17 of the book [Ra3]
is devoted to a detailed exposition of Pourchet's theorem.

For polynomials with certain special properties more precise results can be
obtained. For example, in [Da3] it is shown that, if the values of a polynomial
$f(x)$ at all integers $x$ are sums of squares of two integers, then $f(x)$ is the
sum of squares of two polynomials with integer coefficients.

An example of a non-negative polynomial which cannot be represented over
$\mathbb{Q}$ as the sum of squares of four polynomials is fairly easy to construct.
To see this, suppose that

$$ax^2 + bx + c = \sum_{s=1}^{4} (a_sx + b_s)^2.$$

Then $4ac - b^2$ is the sum of squares of three rational numbers. Indeed,

$$4ac - b^2 = 4\left(\sum a_s^2\right)\left(\sum b_s^2\right) - 4\left(\sum a_sb_s\right)^2$$

and, if we consider the product of quaternions $a_1 + a_2i + a_3j + a_4k$ and
$b_1 - b_2i - b_3j - b_4k$, we see that

$$\left(\sum a_s^2\right)\left(\sum b_s^2\right) = \left(\sum a_sb_s\right)^2 \text{ plus the sum of three squares.}$$

Now it is easy to show that the quadratic $x^2 + x + 4$ cannot be represented
as the sum of squares of four polynomials. In this case $4ac - b^2 = 15$ and we
have to prove that 15 cannot be represented as the sum of squares of three
rational numbers. If 15 were equal to $p^2 + q^2 + r^2$ then, after reducing to the
common denominator, we would obtain the congruence

$$a^2 + b^2 + c^2 \equiv 15d^2 \equiv -d^2 \pmod 8,$$

where at least one of the numbers is odd. But such a congruence is impossible.
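Both steps of this example can be checked by direct computation. The sketch below (function names are hypothetical) first multiplies the quaternion $a_1 + a_2i + a_3j + a_4k$ by the conjugate of $b_1 + b_2i + b_3j + b_4k$ to exhibit $4ac - b^2$ as a sum of three squares, and then verifies by brute force over residues modulo 8 that $a^2 + b^2 + c^2 \equiv 15d^2$ has no solution with at least one odd number.

```python
def quat_mul(p, q):
    """Hamilton product of quaternions given as 4-tuples (1, i, j, k)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def four_ac_minus_b2(a_vec, b_vec):
    """For sum_s (a_s x + b_s)^2 = a x^2 + b x + c, return
    (4ac - b^2, (X, Y, Z)) with 4ac - b^2 == X^2 + Y^2 + Z^2."""
    a = sum(t * t for t in a_vec)
    b = 2 * sum(s * t for s, t in zip(a_vec, b_vec))
    c = sum(t * t for t in b_vec)
    conj_b = (b_vec[0], -b_vec[1], -b_vec[2], -b_vec[3])
    # The real part of the product equals sum a_s * b_s and is not needed.
    _, x, y, z = quat_mul(a_vec, conj_b)
    return 4 * a * c - b * b, (2 * x, 2 * y, 2 * z)


def congruence_solvable_mod8():
    """Is a^2 + b^2 + c^2 = 15 d^2 (mod 8) solvable with some odd number?"""
    r = range(8)
    return any((a * a + b * b + c * c - 15 * d * d) % 8 == 0
               and (a % 2 or b % 2 or c % 2 or d % 2)
               for a in r for b in r for c in r for d in r)
```

Squares modulo 8 take only the values 0, 1 and 4, which is why checking residues $0,\dots,7$ is enough for the second function.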


8 


Appendix 


8.1 The Lenstra-Lenstra-Lovász algorithm


In 1982 the paper [Le2] proposed an algorithm now called the
Lenstra-Lenstra-Lovász algorithm, or the LLL-algorithm for short. It enables
one to factorize any polynomial over $\mathbb{Z}$ into irreducible factors in polynomial
time. More precisely, if $f(x) = \sum_{i=0}^{n} a_ix^i \in \mathbb{Z}[x]$ and $|f| = \sqrt{\sum_{i=0}^{n} a_i^2}$, then,
to factorize $f$ into irreducible factors with this algorithm, one needs not more
than $O\left(n^{12} + n^9(\ln|f|)^3\right)$ operations.

The LLL-algorithm is of great theoretical interest, but in practice it is no
more efficient than the comparatively simple algorithm described in
Section 2.5.2. At the outset both algorithms work similarly: we factorize $f$ into
irreducible factors modulo a prime $p$ with the help of Berlekamp's algorithm,
and then with the help of Hensel's lemma we compute, with a certain accuracy,
a $p$-adic irreducible factor $h$ of $f$. After that the LLL-algorithm proceeds
differently: for $h$, we seek an irreducible divisor $h_0$ of $f$ in $\mathbb{Z}[x]$ divisible by $h$
modulo $p$. The condition of divisibility of $h_0$ by $h$ means that the coefficients
of the polynomial $h_0$ are the coordinates of points of a lattice, and the
condition of divisibility of $f$ by $h_0$ means that the coefficients of $h_0$ are not too
large.

To determine ho, we use an algorithm for constructing a reduced basis of 
a lattice. Observe that the latter algorithm also has numerous applications 
that go beyond the problem of factorization of polynomials. 


8.1.1 The general description of the algorithm 


Let us pass directly to the description of the factorization algorithm. We
assume that the polynomials $f$ and $f'$ are relatively prime and $\operatorname{cont}(f) = 1$.
Let us start by computing the resultant $R(f, f') \in \mathbb{Z}$. Let $p$ be the least prime
which does not divide $R(f, f')$. Then the degree of $f \pmod p$ is equal to $n$
and the polynomials $f \pmod p$ and $f' \pmod p$ are relatively prime. To prove
this, it suffices to observe that $R(f, f') = \pm a_nD(f)$ (and therefore $a_n$ is not
divisible by $p$) and $R(f, f') = \varphi f + \psi f'$, where $\varphi, \psi \in \mathbb{Z}[x]$.
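The choice of the prime $p$ can be sketched in a few lines. The code below is an illustration only, not the procedure of [Le2]: it computes the resultant as the determinant of the Sylvester matrix using exact rational elimination, which is simple but not efficient. Polynomials are lists of integer coefficients with the leading coefficient first, and all function names are hypothetical.

```python
from fractions import Fraction


def derivative(f):
    """Derivative of f, coefficients listed from the leading one down."""
    n = len(f) - 1
    return [c * (n - k) for k, c in enumerate(f[:-1])]


def sylvester(f, g):
    """Sylvester matrix of f (degree n) and g (degree m), size n + m."""
    n, m = len(f) - 1, len(g) - 1
    size = n + m
    rows = [[0] * r + f + [0] * (size - n - 1 - r) for r in range(m)]
    rows += [[0] * r + g + [0] * (size - m - 1 - r) for r in range(n)]
    return rows


def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return d


def resultant_with_derivative(f):
    return int(det(sylvester(f, derivative(f))))


def least_prime_not_dividing(R):
    """Least prime p with p not dividing the nonzero integer R."""
    p = 2
    while True:
        if all(p % q for q in range(2, int(p ** 0.5) + 1)) and R % p != 0:
            return p
        p += 1
```

For $f = x^3 - 2$ the resultant is $\pm 108 = \pm 2^2 \cdot 3^3$, so the least admissible prime is $p = 5$.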

The polynomial f (mod p) has no multiple divisors, and so it can be fac- 
torized into irreducible factors using Berlekamp’s algorithm. 

Let $h \pmod p$ be one of the irreducible factors of $f \pmod p$. In what
follows we assume that $h$ is a monic polynomial and the coefficients of $h$ are
reduced modulo $p$, i.e., lie between 0 and $p - 1$.

Consider the factorization of $f$ into irreducible factors over $\mathbb{Z}$ and pass from
this factorization to a factorization modulo $p$. The polynomial $f \pmod p$ has
no multiple irreducible factors, and so the polynomial $h \pmod p$ corresponds
to precisely one irreducible factor $h_0$ of $f$ over $\mathbb{Z}$. Therefore there exists
precisely one irreducible factor $h_0$ of $f$ over $\mathbb{Z}$ for which $h_0 \pmod p$ is divisible
by $h \pmod p$.

If one knows an algorithm which recovers $h_0$ from $h$, it is easy to factorize
$f$. Indeed, let the factorization $f = f_1f_2$ over $\mathbb{Z}$ be given, where the complete
factorization of $f_1$ over $\mathbb{Z}$ is known and for $f_2$ the complete factorization
modulo $p$ is known (at the first step $f_1 = 1$ and $f_2 = f$). Take an irreducible
divisor $h \pmod p$ of $f_2 \pmod p$ and find the irreducible divisor $h_0$ of $f_2$ over
$\mathbb{Z}$ for which $h_0 \pmod p$ is divisible by $h \pmod p$. Replace $f_1$ by $f_1h_0$ and
$f_2$ by $f_2/h_0$. After that we repeat this operation until we obtain a complete
factorization of $f$.

The algorithm which computes $h_0$ for a given $h \pmod p$ will be described
later in Section 8.1.3. This algorithm uses the algorithm that computes a
reduced basis of the lattice. We therefore first discuss the notion of a reduced
basis of a lattice and the algorithm for calculating the reduced basis (this
algorithm was also suggested in [Le2], and therefore is also called an
LLL-algorithm).


8.1.2 A reduced basis of the lattice

A subset $L \subset \mathbb{R}^n$ is called a lattice of rank $n$ if there exists a basis $b_1,\dots,b_n$
in $\mathbb{R}^n$ such that

$$L = \sum_{i=1}^{n} \mathbb{Z}b_i = \left\{ \sum_{i=1}^{n} r_ib_i \;\middle|\; r_i \in \mathbb{Z} \right\}.$$

The determinant of the lattice $L$ is the number

$$d(L) = |\det(b_1,\dots,b_n)|,$$

where $b_i$ denotes the column-vector of coordinates of the vector $b_i$. The
determinant of the lattice is equal to the volume of the parallelepiped spanned by
$b_1,\dots,b_n$. A lattice has many bases, but the determinant of the
transition from one basis to another is equal to $\pm 1$, and so the number $d(L) > 0$
does not depend on the choice of a basis.

Let $b_1,\dots,b_n$ be a basis of $L$. The Gram-Schmidt orthogonalization yields
an orthogonal (not necessarily orthonormal) basis

$$b_i^* = b_i - \sum_{j=1}^{i-1} \mu_{ij}b_j^*, \quad\text{where } \mu_{ij} = \frac{(b_i, b_j^*)}{(b_j^*, b_j^*)}, \quad 1 \le j < i \le n.$$

A basis $b_1,\dots,b_n$ of the lattice $L$ is called reduced if

$$|\mu_{ij}| \le \frac{1}{2} \text{ for } 1 \le j < i \le n \quad\text{and}\quad |b_i^* + \mu_{i,i-1}b_{i-1}^*|^2 \ge \frac{3}{4}|b_{i-1}^*|^2 \text{ for } 1 < i \le n.$$
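The two conditions in the definition of a reduced basis are easy to test for a concrete basis. The sketch below (hypothetical function names, exact arithmetic with `Fraction`) computes the Gram-Schmidt vectors $b_i^*$ and the coefficients $\mu_{ij}$, and then checks both inequalities; it only tests whether a given basis is reduced and does not perform the reduction itself.

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Return (bstar, mu) with b_i* = b_i - sum_{j<i} mu[i][j] * b_j*."""
    n = len(basis)
    bstar = []
    mu = [[Fraction(0)] * n for _ in range(n)]
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j in range(i):
            mu[i][j] = dot(b, bstar[j]) / dot(bstar[j], bstar[j])
            v = [x - mu[i][j] * y for x, y in zip(v, bstar[j])]
        bstar.append(v)
    return bstar, mu


def is_reduced(basis):
    """Check |mu_ij| <= 1/2 and |b_i* + mu_{i,i-1} b_{i-1}*|^2 >= 3/4 |b_{i-1}*|^2."""
    bstar, mu = gram_schmidt(basis)
    n = len(basis)
    size_ok = all(abs(mu[i][j]) <= Fraction(1, 2)
                  for i in range(n) for j in range(i))
    # By orthogonality, |b_i* + mu b_{i-1}*|^2 = |b_i*|^2 + mu^2 |b_{i-1}*|^2.
    swap_ok = all(
        dot(bstar[i], bstar[i])
        + mu[i][i - 1] ** 2 * dot(bstar[i - 1], bstar[i - 1])
        >= Fraction(3, 4) * dot(bstar[i - 1], bstar[i - 1])
        for i in range(1, n))
    return size_ok and swap_ok
```

For example, the basis $(1,1,1), (-1,0,2), (3,5,6)$ of $\mathbb{Z}^3$ is not reduced (here $\mu_{31} = 14/3$), whereas $(0,1,0), (1,0,1), (-1,0,2)$ satisfies both conditions.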


Theorem 8.1.1 (Hadamard's inequality). The volume of a given parallelepiped
does not exceed the product of the lengths of its edges, i.e.,

$$d(L) \le \prod_{i=1}^{n} |b_i|.$$

Proof. The vectors $b_i^*$ are orthogonal, and so

$$|b_i|^2 = |b_i^*|^2 + \sum_{j=1}^{i-1} \mu_{ij}^2|b_j^*|^2 \ge |b_i^*|^2.$$

Moreover, $d(L) = \prod_{i=1}^{n} |b_i^*|$.

In the following theorem we have collected the main inequalities for the
vectors of a reduced basis.

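Theorem 8.1.2. Let $b_1,\dots,b_n$ be a reduced basis of the lattice $L$. Then:
a) $d(L) \le \prod_{i=1}^{n} |b_i| \le 2^{n(n-1)/4}d(L)$.
b) $|b_j| \le 2^{(i-1)/2}|b_i^*|$ for $1 \le j \le i \le n$.
c) $|b_1| \le 2^{(n-1)/4}d(L)^{1/n}$.
d) If $x \in L$ and $x \ne 0$, then $|b_1| \le 2^{(n-1)/2}|x|$.
e) If the vectors $x_1,\dots,x_t \in L$ are linearly independent, then

$$|b_j| \le 2^{(n-1)/2}\max\{|x_1|,\dots,|x_t|\} \quad\text{for } 1 \le j \le t.$$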


Proof. a) From the definition of a reduced basis it follows that 


iii, fo i i 
[oF |? 2 € — Wh; :) |b; le 2 5% il 


since |pi,i—1| < s. 
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A simple induction shows that |b;|? < 2'~4|b;|? for i > j. Therefore 


i-1 


4 ‘ ‘ 14+2+---4+2*-? caf ite 
ul? = EP + weg < ore (1+ EE) — ye (EE), 
j=l 
Hence 
“ 141142 142-37 
[[ i? ss... — —g— T[lr?s12-.. 2P-lg(Ly =o Dai). 


b) From the inequalities 


os)? < 2° 9|bF)? for i> j 


1+ 2) 
ae ol oP <2 1p? 


we deduce that 
|bj/? < 2*-9 . 29-4 )os/? = 2F-1YoF |? for 4 > j. 
c) Thanks to b) we have

$$|b_1|^2 \le |b_1^*|^2, \quad |b_1|^2 \le 2 |b_2^*|^2, \quad \dots, \quad |b_1|^2 \le 2^{n-1} |b_n^*|^2.$$

The product of these inequalities yields

$$|b_1|^{2n} \le 2^{n(n-1)/2} \prod_{i=1}^{n} |b_i^*|^2 = 2^{n(n-1)/2} d(L)^2.$$
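d) Express $x$ in the form $x = \sum_{i=1}^{n} r_i b_i = \sum_{i=1}^{n} s_i b_i^*$,
where $r_i \in \mathbb{Z}$ and $s_i \in \mathbb{R}$. Let $i_0$ be the greatest
index for which $r_i \ne 0$. Then $r_{i_0} = s_{i_0}$. Hence

$$|x|^2 \ge s_{i_0}^2 |b_{i_0}^*|^2 = r_{i_0}^2 |b_{i_0}^*|^2 \ge |b_{i_0}^*|^2 \ge 2^{1-i_0} |b_1|^2 \ge 2^{1-n} |b_1|^2.$$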
e) The vectors $x_1, \dots, x_t$ cannot all belong to the subspace spanned by
$b_1, \dots, b_{t-1}$. Hence, for some vector $x_s$, we have
$|x_s|^2 \ge |b_i^*|^2$, where $i \ge t$ (see d)). Therefore, for $j \le i$, we
obtain

$$|x_s|^2 \ge |b_i^*|^2 \ge 2^{j-i} |b_j^*|^2 \ge 2^{j-i} \cdot 2^{1-j} |b_j|^2 = 2^{1-i} |b_j|^2 \ge 2^{1-n} |b_j|^2.$$
Let us now describe the algorithm for calculating a reduced basis of the
lattice $L$. Suppose that the vectors $b_1, \dots, b_{k-1}$ form a reduced basis
of the lattice they span (we start with one vector, i.e., with $k = 2$). We
adjoin to them a vector $b_k$, which belongs to $L$ but does not lie in the
subspace spanned by $b_1, \dots, b_{k-1}$. The construction of a reduced basis
of the lattice generated by $b_1, \dots, b_k$ is performed as follows.
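Step 1 (Fulfilling the condition $|\mu_{kj}| \le \tfrac{1}{2}$.) Recall that
$\mu_{kj} = \frac{(b_k, b_j^*)}{(b_j^*, b_j^*)}$. Start with $l = k - 1$ and
suppose that $|\mu_{kj}| \le \tfrac{1}{2}$ for $l < j < k$. Replace $b_k$ by
$b_k - q b_l$, where $q$ is the integer nearest to $\mu_{kl}$. This
transformation preserves $\mu_{kj}$ for $j > l$ (since $b_j^* \perp b_l$ for
$l < j$) and replaces $\mu_{kl}$ by $\mu_{kl} - q$, since
$(b_l, b_l^*) = (b_l^*, b_l^*)$. Clearly, $|\mu_{kl} - q| \le \tfrac{1}{2}$, and
so after such a modification of $b_k$ the condition
$|\mu_{kj}| \le \tfrac{1}{2}$ will be satisfied for $l - 1 < j < k$. Now repeat
the operation with $l$ replaced by $l - 1$, until $l = 1$ has been treated.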

Step 2 (Fulfilling the condition
$|b_k^*|^2 \ge (\tfrac{3}{4} - \mu_{k,k-1}^2) |b_{k-1}^*|^2$.) Suppose that

$$|b_k^*|^2 < \left(\tfrac{3}{4} - \mu_{k,k-1}^2\right) |b_{k-1}^*|^2.$$

Then we replace the ordered set $(b_1, \dots, b_{k-2}, b_{k-1}, b_k)$ by the
ordered set $(b_1, \dots, b_{k-2}, b_k, b_{k-1})$. Under this change
$b_{k-1}^*$ will be replaced by $b_k^* + \mu_{k,k-1} b_{k-1}^*$, and so
$|b_{k-1}^*|^2$ will be replaced by

$$|b_k^*|^2 + \mu_{k,k-1}^2 |b_{k-1}^*|^2 < \left(\tfrac{3}{4} - \mu_{k,k-1}^2\right) |b_{k-1}^*|^2 + \mu_{k,k-1}^2 |b_{k-1}^*|^2 = \tfrac{3}{4} |b_{k-1}^*|^2.$$

Let us consider the reduced basis $b_1, \dots, b_{k-2}$ and apply the first step
of the algorithm to it, with the new vector $b_{k-1}$ adjoined. Since
$|b_{k-1}^*|^2$ decreases, it follows that the algorithm converges (for the
rigorous proof of the convergence of the algorithm, see [Le2] and [Co3]).
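The two steps can be combined into the following sketch. It is written for clarity rather than speed: the Gram-Schmidt data is recomputed from scratch after every change, which an efficient implementation would avoid. The function names are assumptions for this example, indices are 0-based, and the input rows are assumed to be linearly independent integer vectors.

```python
from fractions import Fraction

def dot(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def gram_schmidt(basis):
    bstar, mu = [], [[Fraction(0)] * len(basis) for _ in basis]
    for i, b in enumerate(basis):
        v = [Fraction(c) for c in b]
        for j in range(i):
            mu[i][j] = dot(b, bstar[j]) / dot(bstar[j], bstar[j])
            v = [vc - mu[i][j] * wc for vc, wc in zip(v, bstar[j])]
        bstar.append(v)
    return bstar, mu

def lll_reduce(basis):
    """Return a reduced basis of the lattice spanned by the rows."""
    basis = [list(b) for b in basis]
    n = len(basis)
    k = 1                       # b_0, ..., b_{k-1} are already reduced
    while k < n:
        # Step 1: make |mu_kl| <= 1/2 for l = k-1, k-2, ..., 0.
        for l in range(k - 1, -1, -1):
            _, mu = gram_schmidt(basis)
            q = round(mu[k][l])             # an integer nearest to mu_kl
            if q:
                basis[k] = [a - q * c for a, c in zip(basis[k], basis[l])]
        # Step 2: test |b_k^*|^2 >= (3/4 - mu^2) |b_{k-1}^*|^2.
        bstar, mu = gram_schmidt(basis)
        lhs = dot(bstar[k], bstar[k])
        rhs = (Fraction(3, 4) - mu[k][k - 1] ** 2) * dot(bstar[k - 1], bstar[k - 1])
        if lhs >= rhs:
            k += 1                          # adjoin the next vector
        else:
            basis[k - 1], basis[k] = basis[k], basis[k - 1]
            k = max(k - 1, 1)               # go back and redo Step 1
    return basis

reduced = lll_reduce([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])
```

The exchange in Step 2 and the subtraction in Step 1 are both unimodular operations, so the determinant of the lattice is unchanged by the procedure.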


8.1.3 The lattices and factorization of polynomials 


We recall that it remains to construct an algorithm which, for an irreducible
divisor $h \pmod p$ of the polynomial $f \pmod p$ without multiple divisors,
computes an irreducible divisor $h_0$ of $f$ for which $h_0 \pmod p$ is
divisible by $h \pmod p$. In doing so, we may assume that $h$ is a monic
polynomial.

In intermediate calculations we have to consider divisibility modulo $p^k$.
We therefore consider a more general case: let us assume that $h \pmod{p^k}$ is
a divisor of the polynomial $f \pmod{p^k}$, where $h \pmod p$ is irreducible and
$f \pmod p$ has no multiple divisors, the polynomial $h$ is monic and $h_0$ is
an irreducible divisor of $f$ for which $h_0 \pmod p$ is divisible by
$h \pmod p$. It is easy to verify that in this case $h_0 \pmod{p^k}$ is
divisible by $h \pmod{p^k}$. Indeed, the polynomial $\frac{f}{h_0} \pmod p$ is
not divisible by the irreducible polynomial $h \pmod p$, i.e., these
polynomials are relatively prime. Therefore

$$\lambda h + \mu \frac{f}{h_0} = 1 - p\nu \quad \text{for some } \lambda, \mu, \nu \in \mathbb{Z}[x].$$

Let us multiply both sides of this equality by
$(1 + p\nu + \dots + p^{k-1}\nu^{k-1}) h_0$. As a result, we obtain
$\lambda_1 h + \mu_1 f \equiv h_0 \pmod{p^k}$. But $f \pmod{p^k}$ is divisible by
$h \pmod{p^k}$, and so $h_0 \pmod{p^k}$ is divisible by $h \pmod{p^k}$.

Let $n = \deg f$ and $l = \deg h$. Fix an integer $m \ge l$ and consider the set
of all polynomials with integer coefficients of degree not greater than $m$
which are divisible by $h$ modulo $p^k$. As we have just shown, $h_0$ belongs to
this set if $\deg h_0 \le m$.

To $g(x) = a_0 + \dots + a_m x^m$, we assign the point
$(a_0, \dots, a_m) \in \mathbb{R}^{m+1}$. Under this map the polynomials
considered form a lattice $L$. Clearly, the norm $|g| = \sqrt{\sum a_i^2}$ is
just the Euclidean length in $\mathbb{R}^{m+1}$.

The basis of the lattice $L$ consists of the polynomials $p^k x^i$, where
$0 \le i < l$, and the polynomials $h(x) x^j$, where $0 \le j \le m - l$. The
coordinates of these vectors (polynomials) relative to the basis
$1, x, \dots, x^m$ constitute a matrix of the form
$\begin{pmatrix} p^k I_l & * \\ 0 & I' \end{pmatrix}$, where $I_l$ is the
$l \times l$ unit matrix and $I'$ is an upper triangular matrix with units on
the main diagonal. Hence $d(L) = p^{kl}$.
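To make the shape of this basis concrete, here is a small sketch that writes the basis polynomials as rows of coefficients (constant term first) and checks the determinant. Storing vectors as rows gives the transpose of the matrix described in the text, so the matrix below is lower triangular. The function names and the sample values $h = x + 2$, $p = 5$, $k = 2$, $m = 3$ are assumptions for the illustration.

```python
def factor_lattice_basis(h, p, k, m):
    """Rows are coefficient vectors in Z^{m+1}, constant term first.
    `h` lists the coefficients of the monic polynomial h the same way."""
    l = len(h) - 1
    assert h[-1] == 1 and m >= l
    rows = []
    for i in range(l):                      # p^k * x^i, 0 <= i < l
        row = [0] * (m + 1)
        row[i] = p ** k
        rows.append(row)
    for j in range(m - l + 1):              # h(x) * x^j, 0 <= j <= m - l
        rows.append([0] * j + list(h) + [0] * (m - l - j))
    return rows

def triangular_determinant(rows):
    """Determinant of a lower triangular square matrix."""
    size = len(rows)
    assert all(rows[r][c] == 0 for r in range(size) for c in range(r + 1, size))
    det = 1
    for r in range(size):
        det *= rows[r][r]
    return det

rows = factor_lattice_basis([2, 1], 5, 2, 3)
for row in rows:
    print(row)
print(triangular_determinant(rows))         # p^(k*l) = 5^2 = 25
```

The diagonal consists of $l$ entries equal to $p^k$ followed by the leading coefficients of $h(x) x^j$, which are all 1 because $h$ is monic.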

Before we start describing the algorithm for calculating the polynomial $h_0$,
we prove two theorems which provide the estimates we need.


Theorem 8.1.3. If a polynomial $b \in L$ is such that
$|b|^n \cdot |f|^m < p^{kl}$, then $b$ is divisible by $h_0$. (In particular,
$(f, b) \ne 1$, where $(f, b)$ is the greatest common divisor of $f$ and $b$.)


Proof. We may assume that $b \ne 0$. Set $g = (f, b)$. We would like to prove
that $g$ is divisible by $h_0$. To this end, it suffices to prove that
$g \pmod p$ is divisible by $h \pmod p$. Indeed, if $g \pmod p$ is divisible by
$h \pmod p$, then the polynomial $\frac{f}{g} \in \mathbb{Z}[x]$ is not
divisible by $h_0$ because $h \pmod p$ is a simple (of multiplicity 1) divisor
of $f \pmod p$. Hence $g$ is divisible by $h_0$.

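Suppose that $g \pmod p$ is not divisible by $h \pmod p$. The polynomial
$h \pmod p$ is irreducible, and so $g \pmod p$ and $h \pmod p$ are relatively
prime, i.e., there exist polynomials
$\lambda_1, \mu_1, \nu_1 \in \mathbb{Z}[x]$ such that

$$\lambda_1 h + \mu_1 g = 1 - p\nu_1. \tag{1}$$

Let $e = \deg g$ and $m' = \deg b$. Clearly, $0 \le e \le m' \le m$. Set

$$M = \{\lambda f + \mu b \mid \lambda, \mu \in \mathbb{Z}[x], \ \deg \lambda < m' - e, \ \deg \mu < n - e\} \subset \mathbb{Z} + \mathbb{Z} x + \dots + \mathbb{Z} x^{n+m'-e-1}.$$

Denote by $M'$ the image of $M$ under the natural projection onto

$$\mathbb{Z} x^e + \mathbb{Z} x^{e+1} + \dots + \mathbb{Z} x^{n+m'-e-1}.$$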

First we show that, if the image of $\lambda f + \mu b \in M$ is equal to
$0 \in M'$, then $\lambda = \mu = 0$. Indeed, in this case
$\deg(\lambda f + \mu b) < e$, but $\lambda f + \mu b$ is divisible by $g$ and
$\deg g = e$. Hence $\lambda f + \mu b = 0$, i.e.,
$\lambda \left(\frac{f}{g}\right) = -\mu \left(\frac{b}{g}\right)$. The
polynomials $\frac{f}{g}$ and $\frac{b}{g}$ are relatively prime, and so $\mu$
is divisible by $\frac{f}{g}$. But $\deg \mu < n - e = \deg \frac{f}{g}$. Hence
$\mu = 0$, and therefore $\lambda = 0$.

Thus, the projections of the sets $\{x^i f \mid 0 \le i < m' - e\}$ and
$\{x^j b \mid 0 \le j < n - e\}$ onto $M'$ are, on the one hand, linearly
independent and, on the other hand, generate $M'$. This means that the
projections of these two sets form a basis of the lattice $M'$. In particular,
the rank of the lattice $M'$ is equal to $n + m' - 2e$. Moreover, by Hadamard's
inequality, $d(M') \le |f|^{m'-e} |b|^{n-e}$.

By the hypothesis, $|f|^m |b|^n < p^{kl}$. Since $m' \le m$, we have

$$d(M') \le |f|^m |b|^n < p^{kl}. \tag{2}$$


Let us show that, if $\nu \in M$ and $\deg \nu < e + l$, then
$p^{-k} \nu \in \mathbb{Z}[x]$. Indeed, the polynomial
$\nu = \lambda f + \mu b$ is divisible by $g = (f, b)$, and so having multiplied
(1) by $(1 + p\nu_1 + \dots + p^{k-1}\nu_1^{k-1}) \frac{\nu}{g}$ we obtain

$$\lambda_2 h + \mu_2 \nu \equiv \frac{\nu}{g} \pmod{p^k}. \tag{3}$$

We consider the situation when $f \pmod{p^k}$ is divisible by $h \pmod{p^k}$.
Further, $b \in L$, and so $b \pmod{p^k}$ is divisible by $h \pmod{p^k}$. Since
$\nu \in M$, it follows that $\nu = \lambda f + \mu b$, so $\nu \pmod{p^k}$ is
divisible by $h \pmod{p^k}$. Therefore (3) implies that
$\frac{\nu}{g} \pmod{p^k}$ is also divisible by $h \pmod{p^k}$. But

$$\deg\left(\frac{\nu}{g} \pmod{p^k}\right) < e + l - e = l$$

and $h \pmod{p^k}$ is a monic polynomial of degree $l$. Hence
$\frac{\nu}{g} \equiv 0 \pmod{p^k}$, and therefore $\nu \equiv 0 \pmod{p^k}$.
In $M'$, we can select a basis $b_e, b_{e+1}, \dots, b_{n+m'-e-1}$ such that
$\deg b_j = j$. It is easy to verify that $e + l - 1 \le n + m' - e - 1$.
Indeed, the polynomial $b$ is divisible by $g$, and so
$e = \deg g \le \deg b = m'$; also $\frac{f}{g} \pmod p$ is divisible by
$h \pmod p$, and so

$$l = \deg h \le \deg f - \deg g = n - e.$$


The elements $b_e, \dots, b_{e+l-1} \in M'$ are obtained from the polynomials
that lie in $M$ and are divisible by $p^k$ under the projection that annihilates
terms of degree lower than $e$. Therefore all the coefficients of the
polynomials $b_e, \dots, b_{e+l-1}$ (in particular, their highest coefficients)
are divisible by $p^k$.

Clearly, the determinant $d(M')$ is equal to the absolute value of the product
of the highest coefficients of the polynomials $b_e, \dots, b_{n+m'-e-1}$, and
so it is not less than the product of the absolute values of the highest
coefficients of $b_e, \dots, b_{e+l-1}$. Hence $d(M') \ge p^{kl}$. This
contradicts inequality (2).
In the next theorem we, as earlier, assume that $h$ is a monic polynomial
and $f \pmod{p^k}$ is divisible by $h \pmod{p^k}$; $L$ is the lattice of
polynomials of degree not higher than $m$ and divisible by $h$ modulo $p^k$;
$l = \deg h$ and $n = \deg f$; $h_0$ is an irreducible divisor of $f$ for which
$h_0 \pmod p$ is divisible by $h \pmod p$, i.e., $h_0 \pmod{p^k}$ is divisible
by $h \pmod{p^k}$.


Theorem 8.1.4. Let $b_1, b_2, \dots, b_{m+1}$ be a reduced basis of the lattice
$L$. Suppose that

$$p^{kl} > 2^{mn/2} \binom{2m}{m}^{n/2} |f|^{m+n}.$$

a) Then $\deg h_0 \le m$ if and only if

$$|b_1| < \sqrt[n]{\frac{p^{kl}}{|f|^m}}.$$

b) Suppose that, for a basis vector $b_j$, we have

$$|b_j| < \sqrt[n]{\frac{p^{kl}}{|f|^m}}. \tag{$*$}$$

Let $t$ be the greatest of all such indices $j$. Then

$$\deg h_0 = m + 1 - t; \qquad h_0 = \mathrm{GCD}(b_1, \dots, b_t);$$

and inequality $(*)$ holds for $j = 1, \dots, t$.


Proof. a) First, suppose that $|b_1| < \sqrt[n]{\frac{p^{kl}}{|f|^m}}$, i.e.,
$|b_1|^n \cdot |f|^m < p^{kl}$. Then by Theorem 8.1.3, the polynomial
$b_1 \in L$ is divisible by $h_0$. On the other hand, the condition $b_1 \in L$
implies that $\deg b_1 \le m$. Thus $\deg h_0 \le m$.

Now suppose that $\deg h_0 \le m$, i.e., $h_0 \in L$. By Corollary of Mignotte's
theorem (see page 154), $|h_0| \le \binom{2m}{m}^{1/2} |f|$. By applying
Theorem 8.1.2 (d) for $x = h_0$, we obtain

$$|b_1| \le 2^{m/2} |h_0| \le 2^{m/2} \binom{2m}{m}^{1/2} |f|. \tag{4}$$

By the hypothesis, $2^{mn/2} \binom{2m}{m}^{n/2} |f|^{m+n} < p^{kl}$, i.e.,

$$2^{m/2} \binom{2m}{m}^{1/2} |f| < \sqrt[n]{\frac{p^{kl}}{|f|^m}}. \tag{5}$$

Formulas (4) and (5) yield the inequality desired:
$|b_1| < \sqrt[n]{\frac{p^{kl}}{|f|^m}}$.
b) Let $J$ be the set of all the indices $j$ for which $(*)$ holds. By Theorem
8.1.3, if $j \in J$, then $b_j$ is divisible by $h_0$. Therefore
$h_1 = \mathrm{GCD}(b_j \mid j \in J)$ is divisible by $h_0$. Here if
$j \in J$, then $b_j$ is divisible by $h_1$ and $\deg b_j \le m$, i.e., $b_j$
belongs to the lattice

$$\mathbb{Z} h_1 + \mathbb{Z} h_1 x + \dots + \mathbb{Z} h_1 x^{m - \deg h_1}.$$

Since the vectors $b_1, \dots, b_{m+1}$ are linearly independent, it follows
that

$$|J| \le m + 1 - \deg h_1, \tag{6}$$

where $|J|$ is the cardinality of $J$.
By Corollary of Mignotte's theorem (see page 154), we have

$$|h_0 x^i| = |h_0| \le \binom{2m}{m}^{1/2} |f| \quad \text{for any } i \ge 0.$$

By definition, $h_0 x^i \in L$ for $i = 0, 1, \dots, m - \deg h_0$, and these
vectors are linearly independent. Theorem 8.1.2 (e) is therefore applicable to
them:

$$|b_j| \le 2^{m/2} \max_i |h_0 x^i| \le 2^{m/2} \binom{2m}{m}^{1/2} |f| \quad \text{for } 1 \le j \le m + 1 - \deg h_0.$$

By the hypothesis,
$2^{m/2} \binom{2m}{m}^{1/2} |f| < \sqrt[n]{\frac{p^{kl}}{|f|^m}}$, and so

$$\{1, 2, \dots, m + 1 - \deg h_0\} \subset J.$$

Since $h_1$ is divisible by $h_0$, it follows that $\deg h_1 \ge \deg h_0$, and
therefore

$$|J| \ge m + 1 - \deg h_0 \ge m + 1 - \deg h_1. \tag{7}$$

By comparing the inequalities (6) and (7) we see that
$\deg h_0 = \deg h_1 = m + 1 - t$ and $J = \{1, 2, \dots, t\}$.
It remains to verify that $h_0 = \pm h_1$. Since $h_0$ is a divisor of the
polynomial $f$ with content 1, it follows that $\mathrm{cont}(h_0) = 1$. Let
$j \in J$ and $d_j = \mathrm{cont}(b_j)$. By Theorem 8.1.3, the polynomial $b_j$
is divisible by $h_0$. Therefore $\frac{b_j}{d_j}$ is also divisible by $h_0$.
Since $h_0 \in L$, we deduce that $\frac{b_j}{d_j} \in L$. But $b_j$ is a basis
element of the lattice $L$, and so $d_j = 1$. This means that
$\mathrm{cont}(h_1) = 1$, since $b_j$ is divisible by $h_1$. Thus $h_1$ is
divisible by $h_0$ and $\mathrm{cont}(h_1) = 1$, and so $h_0 = \pm h_1$.


Now we are able to describe the algorithm for calculating the polynomial
$h_0$.
An auxiliary algorithm. (For a fixed $m$, this algorithm verifies whether
or not $\deg h_0 \le m$. If this is true, then it calculates the polynomial
$h_0$.)

INITIAL DATA:
• a polynomial $f$ of degree $n$;
• a prime $p$;
• a positive integer $k$;
• a monic polynomial $h$ for which $f \pmod{p^k}$ is divisible by
  $h \pmod{p^k}$; the polynomial $h \pmod p$ is irreducible and $f \pmod p$ is
  not divisible by $h^2 \pmod p$.
  We also assume that the coefficients of $h$ are reduced modulo $p^k$, in
  other words, lie between 0 and $p^k - 1$ (here $|h|^2 \le 1 + l p^{2k}$);
• a positive integer $m \ge l = \deg h$ such that

$$p^{kl} > 2^{mn/2} \binom{2m}{m}^{n/2} |f|^{m+n}. \tag{8}$$

ALGORITHM'S PERFORMANCE. For a lattice $L$ with basis

$$\{p^k x^i \mid 0 \le i < l\} \cup \{h x^j \mid 0 \le j \le m - l\},$$

we find a reduced basis $b_1, \dots, b_{m+1}$.

If $|b_1| \ge \sqrt[n]{\frac{p^{kl}}{|f|^m}}$, then $\deg h_0 > m$ and the
algorithm stops.

If $|b_1| < \sqrt[n]{\frac{p^{kl}}{|f|^m}}$, then $\deg h_0 \le m$ and
$h_0 = \mathrm{GCD}(b_1, \dots, b_t)$, where $t$ is determined in the
formulation of Theorem 8.1.4 (b).
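Both the hypothesis (8) and the length test on $b_1$ can be evaluated without floating-point roots by comparing squares of both sides, since $|g|^2$ is an integer for a polynomial $g$ with integer coefficients. The sketch below shows only these two tests; the function names and the sample data are assumptions for the illustration, and polynomials are given as coefficient lists.

```python
from math import comb

def norm_sq(poly):
    """|g|^2 for a polynomial given by its integer coefficients."""
    return sum(c * c for c in poly)

def condition_8_holds(f, p, k, l, m):
    """p^{kl} > 2^{mn/2} C(2m, m)^{n/2} |f|^{m+n}, with both sides squared."""
    n = len(f) - 1
    return p ** (2 * k * l) > 2 ** (m * n) * comb(2 * m, m) ** n * norm_sq(f) ** (m + n)

def is_short(b, f, p, k, l, m):
    """|b| < (p^{kl} / |f|^m)^{1/n}, i.e. |b|^{2n} |f|^{2m} < p^{2kl}."""
    n = len(f) - 1
    return norm_sq(b) ** n * norm_sq(f) ** m < p ** (2 * k * l)

# Assumed sample data: f = x^2 - 1, p = 3, l = 1, m = 1.
f = [-1, 0, 1]
print(condition_8_holds(f, 3, 2, 1, 1))     # False: 81 is not > 128
print(condition_8_holds(f, 3, 3, 1, 1))     # True: 729 > 128
print(is_short([1, 1], f, 3, 3, 1, 1))      # True: 2^2 * 2 = 8 < 729
```

In the auxiliary algorithm, `is_short` would be applied to the first vector of the reduced basis to decide between the two cases, and then to the remaining vectors to find the index $t$.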

Main algorithm. (The algorithm calculates an irreducible divisor $h_0$ of
$f$ such that $h_0 \pmod p$ is divisible by $h \pmod p$.)

We may assume that $l = \deg h < \deg f = n$ and the coefficients of $h$ are
reduced modulo $p$, i.e., lie between 0 and $p - 1$.

ALGORITHM'S PERFORMANCE. First, we compute the least positive integer
$k$ for which inequality (8) holds for $m = n - 1$:

$$p^{kl} > 2^{(n-1)n/2} \binom{2(n-1)}{n-1}^{n/2} |f|^{2n-1}.$$

Next, for the factorization $f \equiv hg \pmod p$, we calculate its Hensel
lift $f \equiv \tilde{h} \tilde{g} \pmod{p^k}$, where $k$ is the positive
integer just computed. Here $\tilde{h} \equiv h \pmod p$ and the coefficients of
$\tilde{h}$ are assumed to be reduced modulo $p^k$.


Let $u$ be the largest integer for which $l \le \frac{n-1}{2^u}$. We
consecutively execute the auxiliary algorithm for

$$m = \left[\frac{n-1}{2^u}\right], \left[\frac{n-1}{2^{u-1}}\right], \dots, \left[\frac{n-1}{2}\right], n - 1$$

until we compute $h_0$. If we are unable to compute $h_0$, then
$\deg h_0 > n - 1$ and $h_0 = f$, i.e., f is irreducible.
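The two bookkeeping steps of the main algorithm, choosing $k$ and listing the values of $m$ to try, can be sketched as follows. The function names `least_k` and `m_schedule` are assumptions for this example; the Hensel lifting and the lattice reduction themselves are not shown. The bound is compared in squared form so that only integer arithmetic is used.

```python
from math import comb

def least_k(f, p, l):
    """Least positive k with
    p^{kl} > 2^{(n-1)n/2} C(2(n-1), n-1)^{n/2} |f|^{2n-1}."""
    n = len(f) - 1
    m = n - 1
    f_sq = sum(c * c for c in f)            # |f|^2
    bound_sq = 2 ** (m * n) * comb(2 * m, m) ** n * f_sq ** (m + n)
    k = 1
    while p ** (2 * k * l) <= bound_sq:
        k += 1
    return k

def m_schedule(n, l):
    """Values [(n-1)/2^u], ..., [(n-1)/2], n-1, where u is the largest
    integer with l <= (n-1)/2^u. Assumes 1 <= l < n."""
    u = 0
    while l * 2 ** (u + 1) <= n - 1:
        u += 1
    return [(n - 1) // 2 ** i for i in range(u, -1, -1)]

print(least_k([-1, 0, 1], 3, 1))    # 3, since 3^4 <= 128 < 3^6 in squared form
print(m_schedule(9, 2))             # [2, 4, 8]
```

Trying the small values of $m$ first means that a factor $h_0$ of low degree is found with a lattice of low dimension, and the full dimension $n$ is reached only when no smaller $m$ succeeds.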